Abstract — We consider the problem of learning a non-deterministic
probabilistic system consistent with a given finite
set of positive and negative tree samples. Consistency is defined 
with respect to strong simulation conformance. We propose 
learning algorithms that use traditional and a new stochastic 
state-space partitioning, the latter resulting in the minimum 
number of states. We then use them to solve the problem of active
learning, which uses a knowledgeable teacher to generate samples
as counterexamples to simulation equivalence queries. We show
that the problem is undecidable in general, but that it becomes 
decidable under a suitable condition on the teacher which 
comes naturally from the way samples are generated from failed 
simulation checks. The latter problem is shown to be undecidable 
if we impose an additional condition on the learner to always 
conjecture a minimum state hypothesis. We therefore propose a 
semi-algorithm using stochastic partitions. Finally, we apply the 
proposed (semi-) algorithms to infer intermediate assumptions 
in an automated assume-guarantee verification framework for 
probabilistic systems. 

Index Terms — probability, transition, system, simulation, con- 
formance, active learning, tree, partition, assume-guarantee 

I. Introduction 

We study the problem of learning an unknown non- 
deterministic Labeled Probabilistic Transition System (LPTS) 
from tree samples. The motivation for this work was to investigate
learning techniques for automating assume-guarantee
style [25] compositional verification of strong simulation
conformance [28] between LPTSes. Strong simulation for LPTSes
is decidable in polynomial time [4] and yields stochastic tree
counterexamples when it fails [19]. Stochastic trees are tree-shaped
LPTSes (see Section II) with probabilities appearing
on the transitions.

Compositional verification [11] is a promising approach for
alleviating the state explosion problem in model checking [12].
Learning from trace Q, [23] and tree [9] counterexamples has
been successfully applied before for automating the approach
in a non-probabilistic setting, for checking trace inclusion [26],
[10] and simulation conformance [9], respectively. The most
closely related work [9] reduces simulation conformance to
tree language inclusion and uses learning for deterministic tree
automata to automatically generate the assumptions used in
compositional reasoning. In the probabilistic setting, existing
literature has dealt with learning from samples consisting of 
trees with information regarding the probability of accep- 
tance 0, but learning from stochastic trees has not been 
considered before. Moreover, there is no existing probabilis- 
tic variant of a tree automaton to recognize stochastic tree 
languages. This motivated us to consider learning an LPTS 
directly, without working with tree languages or tree automata. 

We consider first the problem of learning a non- 
deterministic LPTS that is consistent with respect to a set of 
positive and negative stochastic tree samples, where consis- 
tency is defined in terms of strong simulation conformance. 
For the purpose of verification, we want the learnt models to 
be minimal or at least to have a good upper bound on their 
size. We describe two algorithms, each using a different way of
partitioning the state-space of the positive samples. One algorithm
uses traditional state-space partitioning (Section III-A)
resulting in the least number of partitions, while the other uses
a new stochastic partitioning (Section III-B) resulting in the
least number of states.

We then apply the above algorithms to solve the problem of
learning an unknown target in Section IV. This is done in the
framework of active learning with the help of a knowledgeable 
teacher. Typically active learning algorithms assume a teacher 
that answers two types of queries - membership (of a sample in 
the unknown target) and equivalence (between the conjectured 
model and the unknown target) [2]. However, we observe
that membership queries are not straightforward to create in 
our case as the learner would need to guess the transition 
probabilities, along with the tree-structure. Therefore, we only 
assume the teacher can answer equivalence queries - the 
teacher checks simulation equivalence (two-way simulation 
conformance) between a conjectured LPTS and the target 
LPTS and returns positive or negative stochastic trees when 
the check fails. 

We show that active learning for LPTSes is undecidable 
in general. We then propose a learning algorithm that works 
under an assumption on the teacher which comes naturally 
from the way the tree counterexamples are generated from 
failed simulation checks. As we are interested in learning an 
LPTS of the least number of states, we also consider imposing 
a restriction on the learner to always conjecture a minimum 
state hypothesis. Learning with this restriction also turns out 
to be undecidable and we propose a semi-algorithm using 
stochastic partitions. 



LPTSes are related to probabilistic automata (PA) [27].
Algorithms to learn PAs have only been proposed in restricted
settings of stronger assumptions on a teacher [29] or approximate
learning [13], [21]. Algorithms to learn a multiplicity automaton,
which generalizes a PA by replacing the probabilities
with arbitrary rationals, have also been proposed [5]. Adapting
these to solve verification problems involving probabilistic
transition systems is difficult and results in non-terminating
algorithms [14]. On the other hand, we show in Section V
that one can readily apply the algorithms we propose to infer 
intermediate assumptions in an automated assume-guarantee 
style framework for the verification of strong simulation 
conformance between LPTSes. This yields the first complete 
and fully automated learning framework for compositional 
verification of probabilistic systems. Moreover, one can ex- 
tend this framework to check logical properties, such as the 
fragment weakly safe PCTL [8], which are preserved by the 
conformance and also have tree counterexamples. 

Other Related Work. Learning for automating compositional
reasoning of probabilistic systems has been proposed before [15]
in the context of checking probabilistic reachability
properties, which are refuted by sets of trace counterexamples.
The approach uses a variant of L* [2], a learning algorithm
for DFAs, to automatically learn deterministic assumptions,
following previous work in the non-probabilistic setting [26].
The approach uses a sound but incomplete rule, and therefore,
it is not guaranteed to terminate (completeness is necessary for
termination). A complete rule for such properties restricted
to systems without non-determinism has been considered
recently [14]. It uses learning with probabilistic trace inclusion
as the conformance relation, which is undecidable. Also, the
learning algorithm is not guaranteed to terminate. In contrast,
we use simulation conformance, which is decidable in polynomial
time and leads to a sound and complete rule (Section V).
We are also able to guarantee termination for the algorithm 
proposed in Section [VI when using classical partitions to infer 
a consistent LPTS. 

Our work draws inspiration from a previous work [18] that
automates assumption generation by using an algorithm for 
learning the minimal separating automaton from positive and 
negative trace counterexamples. The counterexamples are pro- 
vided via model checking in an assume-guarantee framework. 
Similar to our work, they use a partitioning approach, where 
the goal is to find a folding of the counterexamples into the 
learnt model. A different approach has been proposed to find 
the separating automaton based on L* which makes use of 
membership queries, in addition to equivalence queries [10].
All these works were done in the context of non-probabilistic 
reasoning under trace semantics and thus, are different from 
our setting. 

Learning a minimum-state automaton from positive and
negative samples is a well-studied problem [3], [24], [16] that
is known to be hard [17]. Algorithms have also been proposed
for samples with stochastic information, i.e. the probability of 
acceptance of a trace or a tree J6], Q, learning stochastic 
finite (tree) automata. As noted earlier, we cannot
immediately borrow existing results from the above
automata-theoretic approaches.

Fig. 1: Three reactive LPTSes. $p \in (0, 1)$ for $C_p$.

II. Preliminaries 

Labeled Probabilistic Transition Systems. Let S be a non- 
empty set. Dist(S) is defined to be the set of discrete proba- 
bility distributions over S. We assume that all the probabilities 
specified explicitly in a distribution are rationals in [0, 1]; 
there is no unique representation for all real numbers on a 
computer and floating-point numbers are essentially rationals. 
For $s \in S$, $\delta_s$ is the Dirac distribution on $s$, i.e. $\delta_s(s) = 1$ and
$\delta_s(t) = 0$ for all $t \neq s$. For $\mu \in Dist(S)$, the support of $\mu$,
denoted $Supp(\mu)$, is defined to be the set $\{s \in S \mid \mu(s) > 0\}$
and for $X \subseteq S$, $\mu(X)$ stands for $\sum_{s \in X} \mu(s)$. The models
we consider, defined below, have both probabilistic and non- 
deterministic behavior. Thus, there can be a non-deterministic 
choice between two probability distributions, even for the same 
action. Such modeling is typically used for underspecification. 
Moreover, the theory described does not become any simpler 
by disallowing non-deterministic choice for a given action (see 
the discussion on counterexamples at the end of this section). 

Definition 1 (LPTS). A Labeled Probabilistic Transition System
(LPTS) is a tuple $(S, s^0, \alpha, \tau)$ where $S$ is a set of states,
$s^0 \in S$ is a distinguished start state, $\alpha$ is a set of actions and
$\tau \subseteq S \times \alpha \times Dist(S)$ is a probabilistic transition relation. For
$s \in S$, $a \in \alpha$ and $\mu \in Dist(S)$, we denote $(s, a, \mu) \in \tau$ by
$s \xrightarrow{a} \mu$ and say that $s$ has a transition on $a$ to $\mu$.

An LPTS is called reactive if $\tau$ is a partial function from
$S \times \alpha$ to $Dist(S)$ (i.e. at most one transition on a given action
from a given state).
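
As a concrete illustration of Definition 1, the following minimal Python sketch represents a distribution as a dictionary from states to exact rationals and a transition relation as a list of (state, action, distribution) triples. The function names `dirac`, `support` and `is_reactive`, as well as the sample transitions, are assumptions made for this illustration and do not come from the paper.

```python
from collections import Counter
from fractions import Fraction

def dirac(s):
    """Dirac distribution on s: probability 1 on s and 0 elsewhere."""
    return {s: Fraction(1)}

def support(mu):
    """Supp(mu): the states with strictly positive probability."""
    return {s for s, p in mu.items() if p > 0}

def is_reactive(transitions):
    """True iff no state has more than one transition on the same action."""
    counts = Counter((s, a) for s, a, _ in transitions)
    return all(c <= 1 for c in counts.values())

# Hypothetical transitions, not taken from any figure of the paper.
reactive = [
    ("s0", "a", {"s1": Fraction(1, 2), "s2": Fraction(1, 2)}),
    ("s1", "b", dirac("s3")),
]
# Adding a second a-transition from s0 makes the choice non-deterministic.
nondeterministic = reactive + [("s0", "a", dirac("s2"))]
```

Using `Fraction` mirrors the assumption that all explicitly specified probabilities are rationals.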

Throughout this paper, we use filled circles to denote start 
states in the pictorial representations of LPTSes. For example,
Figure Q] shows three LPTSes. For \i — {(si, |), (s2, h)}, 
L\ has the transition s\ A fi. All the LPTSes in the figure 
are reactive as no state has more than one transition on a
given action. In the literature, an LPTS is also called a simple
probabilistic automaton [28]. Similarly, a reactive LPTS is also
called a (Labeled) Markov Decision Process. Also, note that an
LPTS with all the distributions restricted to Dirac distributions
is the classical (non-probabilistic) Labeled Transition System
(LTS); thus a reactive LTS corresponds to the standard notion
of a deterministic LTS. We only consider finite state, finite
alphabet and finitely branching (i.e. finitely many transitions
from any state) LPTSes. We use $(S_i, s_i^0, \alpha_i, \tau_i)$ for an LPTS
$L_i$ and $(S_L, s_L^0, \alpha_L, \tau_L)$ for an LPTS $L$.

We are also interested in LPTSes with a tree structure, i.e.
the start state is not in the support of any distribution and every
other state is in the support of exactly one distribution. We call
such LPTSes stochastic trees or simply trees. For example, $C_p$,
$p \in (0, 1)$, in Figure 1 is a tree.

Fig. 2: A simple example where matching probabilities (solid edges) directly
proves $\mu_1 \sqsubseteq_R \mu_2$.

Strong Simulation. In the non-probabilistic case, for two 
labeled transition systems (LTSes), a pair of states belonging to 
a strong simulation relation depends on whether certain other 
pairs of successor states also belong to the relation [22]. For
LPTSes, one has successor distributions instead of successor 
states; a pair of states belonging to a strong simulation relation 
R should now depend on whether certain other pairs in the 
supports of these successor distributions also belong to R. 
We thus need a binary relation between distributions, 
which depends on the relation R between states. Intuitively, 
two distributions can be related if we can pair the states in 
their support sets, the pairs contained in R, matching all the 
probabilities under the distributions. 

Consider an example with $s R t$ and the transitions $s \xrightarrow{a} \mu_1$
and $t \xrightarrow{a} \mu_2$ with $\mu_1$ and $\mu_2$ as in Figure 2. In this case, one
easy way to match the probabilities is to pair $s_1$ with $t_1$ and
$s_2$ with $t_2$. This is sufficient if $s_1 R t_1$ and $s_2 R t_2$ also hold,
in which case, we say that $\mu_1 \sqsubseteq_R \mu_2$. However, such a direct
matching may not be possible in general. As shown in Figure 3,
we need a more general notion of matching the probabilities.
One can achieve that by splitting the probabilities under the
distributions in such a way that one can then directly match
the probabilities as in Figure 2. Now, if $s_1 R t_1$, $s_1 R t_2$, $s_2 R t_2$
and $s_2 R t_3$ also hold, we say that $\mu_1 \sqsubseteq_R \mu_2$. Note that there
can be more than one possible splitting.

This is the central idea behind the following definition
where the splitting is achieved by a weight function. For
the rest of the section, let $L_1$ and $L_2$ be two LPTSes,
$\mu_1 \in Dist(S_1)$, $\mu_2 \in Dist(S_2)$ and $R \subseteq S_1 \times S_2$.

Definition 2 ([28]). $\mu_1 \sqsubseteq_R \mu_2$ iff there is a weight function
$w : S_1 \times S_2 \to \mathbb{Q} \cap [0, 1]$ such that

1) $\mu_1(s_1) = \sum_{s_2 \in S_2} w(s_1, s_2)$ for all $s_1 \in S_1$,

2) $\mu_2(s_2) = \sum_{s_1 \in S_1} w(s_1, s_2)$ for all $s_2 \in S_2$,

3) $w(s_1, s_2) > 0$ implies $s_1 R s_2$ for all $s_1 \in S_1$, $s_2 \in S_2$.

$\mu_1 \sqsubseteq_R \mu_2$ can be checked by computing the maxflow in
an appropriate network and checking if it equals 1.0 [4]. If
$\mu_1 \sqsubseteq_R \mu_2$ holds, $w$ in the above definition is one such maxflow
function. As explained above, $\mu_1 \sqsubseteq_R \mu_2$ can be understood
as matching all the probabilities (after splitting appropriately)
under $\mu_1$ and $\mu_2$. Considering $Supp(\mu_1)$ and $Supp(\mu_2)$ as two
partite sets, this is the weighted analog of saturating a partite
set in bipartite matching, giving us the following analog of the
well-known Hall's Theorem for saturating $Supp(\mu_1)$.

Fig. 3: An example where probabilities are split (arrows) before matching
(solid edges) to prove $\mu_1 \sqsubseteq_R \mu_2$.

Lemma 1 ([30]). $\mu_1 \sqsubseteq_R \mu_2$ iff for every $S \subseteq Supp(\mu_1)$,
$\mu_1(S) \le \mu_2(R(S))$.

It follows that when $\mu_1 \not\sqsubseteq_R \mu_2$, there exists a witness
$S \subseteq Supp(\mu_1)$ such that $\mu_1(S) > \mu_2(R(S))$. For example,
if $R(s_2) = \emptyset$ in Figure 2, its probability under $\mu_1$ cannot
be matched and $S = \{s_2\}$ is a witness subset.
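
The subset condition of Lemma 1 can be checked directly by enumerating the subsets of the support, which is exponential and only meant as an illustration (the maxflow formulation mentioned above is the efficient route). The sketch below is a minimal Python version under that assumption; the names `find_witness` and `lifted`, and the numeric distributions, are hypothetical and are not the values shown in the figures.

```python
from fractions import Fraction
from itertools import combinations

def image(R, subset):
    """R(S): all states related by R to some state of the subset."""
    return {t for (s, t) in R if s in subset}

def mass(mu, states):
    """mu(X): total probability that mu assigns to the set X."""
    return sum((mu.get(s, Fraction(0)) for s in states), Fraction(0))

def find_witness(mu1, mu2, R):
    """Return some S within Supp(mu1) with mu1(S) > mu2(R(S)), or None."""
    supp = [s for s, p in mu1.items() if p > 0]
    for size in range(1, len(supp) + 1):
        for subset in combinations(supp, size):
            if mass(mu1, subset) > mass(mu2, image(R, subset)):
                return set(subset)
    return None

def lifted(mu1, mu2, R):
    """mu1 is related to mu2 under R iff no witness subset exists."""
    return find_witness(mu1, mu2, R) is None

# Hypothetical distributions where probabilities must be split to match.
mu1 = {"s1": Fraction(1, 2), "s2": Fraction(1, 2)}
mu2 = {"t1": Fraction(1, 4), "t2": Fraction(1, 2), "t3": Fraction(1, 4)}
R_full = {("s1", "t1"), ("s1", "t2"), ("s2", "t2"), ("s2", "t3")}
R_partial = {("s1", "t1"), ("s1", "t2")}  # s2 is related to nothing
```

With `R_full` every subset condition holds, so the distributions are related; with `R_partial` the image of `{"s2"}` is empty, so `{"s2"}` is returned as a witness.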

Definition 3 (Strong Simulation [28]). $R$ is a strong simulation
iff for every $s_1 R s_2$ and $s_1 \xrightarrow{a} \mu_1$ there is a $\mu_2$ with
$s_2 \xrightarrow{a} \mu_2$ and $\mu_1 \sqsubseteq_R \mu_2$.

For $s_1 \in S_1$ and $s_2 \in S_2$, $s_2$ strongly simulates $s_1$, denoted
$s_1 \preceq s_2$, iff there is a strong simulation $T$ such that $s_1 T s_2$. $L_2$
strongly simulates $L_1$, also denoted $L_1 \preceq L_2$, iff $s_1^0 \preceq s_2^0$. For
the latter, alternatively, we say that simulation conformance
holds between $L_1$ and $L_2$.

Definition 4 (Strong Simulation Equivalence). The strong
simulation equivalence, denoted $\sim$, is defined as the kernel
of strong simulation, i.e. $\preceq \cap \preceq^{-1}$.

Definition 3 generalizes the one in the non-probabilistic
setting [22] and has the following immediate consequence.

Lemma 2. $\preceq \subseteq S_1 \times S_2$ is the coarsest strong simulation, i.e.
$\preceq$ is a strong simulation and contains every strong simulation.

Simulation conformance is decidable in polynomial time [4]
and can be checked with a greatest fixed point algorithm that
computes the coarsest simulation between $L_1$ and $L_2$. The
algorithm uses a relation variable $R$ initialized to $S_1 \times S_2$ and
it checks the condition in Definition 3 for every pair in $R$,
iteratively, removing any violating pairs from $R$. The algorithm
terminates when a fixed point is reached showing $L_1 \preceq L_2$
or when the pair of start states is removed showing $L_1 \not\preceq L_2$.
Several optimizations exist [30] but we do not consider them
here, for simplicity.
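
The greatest fixed point computation just described can be sketched as follows. This is a minimal, unoptimized Python illustration: it checks the relation between distributions with the exponential subset test of Lemma 1 instead of maxflow, and it always runs to the fixed point rather than stopping early when the start pair is removed. The triple representation of an LPTS, the function names and the two example systems are assumptions made for this sketch.

```python
from fractions import Fraction
from itertools import combinations

def lifts(mu1, mu2, R):
    """Check that mu1 is related to mu2 under R (subset test of Lemma 1)."""
    supp = [s for s, p in mu1.items() if p > 0]
    for n in range(1, len(supp) + 1):
        for subset in combinations(supp, n):
            img = {t for (s, t) in R if s in subset}
            if sum(mu1[s] for s in subset) > sum(mu2.get(t, 0) for t in img):
                return False
    return True

def coarsest_simulation(states1, trans1, states2, trans2):
    """Start from S1 x S2 and remove violating pairs until nothing changes."""
    R = {(s, t) for s in states1 for t in states2}
    changed = True
    while changed:
        changed = False
        for (s, t) in sorted(R):
            for (src1, a, mu1) in trans1:
                if src1 != s:
                    continue
                matched = any(src2 == t and b == a and lifts(mu1, mu2, R)
                              for (src2, b, mu2) in trans2)
                if not matched:
                    R.discard((s, t))
                    changed = True
                    break
    return R

def simulated_by(lpts1, lpts2):
    """L1 is simulated by L2 iff the pair of start states survives."""
    (states1, start1, trans1), (states2, start2, trans2) = lpts1, lpts2
    return (start1, start2) in coarsest_simulation(states1, trans1, states2, trans2)

# Two small hypothetical LPTSes given as (states, start state, transitions).
half = Fraction(1, 2)
L1 = (["s0", "s1", "s2", "s3"], "s0",
      [("s0", "a", {"s1": half, "s2": half}), ("s1", "b", {"s3": Fraction(1)})])
L2 = (["t0", "t1", "t2"], "t0",
      [("t0", "a", {"t1": Fraction(1)}), ("t1", "b", {"t2": Fraction(1)})])
```

In this example `L2` simulates `L1`, but not conversely: `t1` can perform `b`, while only `s1`, which carries half of the probability mass, can match it.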

Lemma 3 ([28]). $\preceq$ is a preorder (i.e. reflexive and transitive).

Finally, we find the following characterization of $\preceq$ useful
in the algorithms we will discuss later on.

Fig. 4: An example showing that Lemma 4 does not hold, in general,
if $L_1$ is not a tree. Let $R = \{(s_1, t_1), (s_2, t_2)\}$. Note that
$\preceq = \{(s_1, t_1), (s_2, t_2), (s_2, t_3)\}$ and $R \subset \preceq$.

Lemma 4. Let $L_1$ be a tree and $s_1 R s_2$ iff for every $s_1 \xrightarrow{a} \mu_1$
there exists $s_2 \xrightarrow{a} \mu_2$ with $\mu_1 \sqsubseteq_R \mu_2$. Then, $R = \preceq$.

Proof Sketch: $R \subseteq \preceq$ by Def. 3. $\preceq \subseteq R$ can be proved by
induction on the height of a state of $L_1$ using Lemma 2. ■

Note that the condition on $R$ in the lemma is stronger than
the one to make it a strong simulation (Definition 3). Also, if
$L_1$ is not a tree, we can only conclude that $R \subseteq \preceq$, in general.
See Figure 4 for an example where $R \subset \preceq$.

Counterexamples to $\preceq$. In the active learning problem we
are interested in (Section IV), a learner uses counterexamples
to simulation conformance as diagnostic information. We will
now briefly discuss what these counterexamples are. Let $L_1$
and $L_2$ be two LPTSes.

Definition 5 (Language of an LPTS). Given an LPTS
$L$, we define its language, denoted $\mathcal{L}(L)$, as the set
$\{L' \mid L' \text{ is an LPTS and } L' \preceq L\}$.

Lemma 5. $L_1 \preceq L_2$ iff $\mathcal{L}(L_1) \subseteq \mathcal{L}(L_2)$.

Proof: Necessity follows trivially from the transitivity of
$\preceq$ and sufficiency follows from the reflexivity of $\preceq$ which
implies $L_1 \in \mathcal{L}(L_1)$. ■

Thus, a counterexample $C$ can be defined as follows.

Definition 6 (Counterexample). A counterexample to $L_1 \preceq L_2$
is an LPTS $C$ such that $C \in \mathcal{L}(L_1) \setminus \mathcal{L}(L_2)$, i.e. $C \preceq L_1$ but
$C \not\preceq L_2$.

Now, $L_1$ itself is a trivial choice for $C$ but it does not give
any more useful information than what we had before checking
the conformance. Moreover, it is preferable to have $C$ with a
special and simpler structure to efficiently work with
counterexamples. Fortunately, we have a simpler characterization
using trees.

Theorem 1 ([19]). If $L_1 \not\preceq L_2$, there is a tree which serves
as a counterexample.

Proof Sketch: One can instrument the algorithm to
compute the coarsest strong simulation described earlier to
obtain a tree counterexample whenever a pair of states is
removed from the current relation, making use of Lemma 1. ■

For example, C v in Figure Q] for p E (0, |], is a counterex- 
ample to L\ < L2- In another work, we showed that structures 
simpler than trees are not sufficient as counterexamples, even 
when one of the models is reactive [19].



We note an important feature of the algorithm used to prove
the above theorem [19]. A counterexample $C$ generated by
the algorithm is essentially a finite tree execution of $L_1$. That
is, there is a total mapping $M : S_C \to S_1$ such that for
every transition $c \xrightarrow{a} \mu_c$ of $C$, there exists $M(c) \xrightarrow{a} \mu_1$ such
that $M$ restricted to $Supp(\mu_c)$ is an injection and for every
$c' \in Supp(\mu_c)$, $\mu_c(c') = \mu_1(M(c'))$. Note that $M$ is also
a strong simulation. We call such a mapping an execution
mapping from $C$ to $L_1$ in the rest of the paper. An execution
mapping is shown in brackets beside the states of C p for 
p = \ in Figure [T] While our algorithm always generates 
counterexamples with an execution mapping, it is possible 
to have a tree counterexample, as per Definition [6] without 
such a mapping. For example, C p in Figure Q] for p E (0, j) 
is also a counterexample with no such execution mapping. 
The condition we impose on a teacher in the active learning
problem (Section IV) is regarding this execution mapping.

III. Learning a Consistent LPTS 

We are interested in the problem where we are given a
finite set of positive stochastic trees (i.e. in the language of
an LPTS), say $\mathcal{P}$, and another finite set of negative stochastic
trees (i.e. not in the language of an LPTS), say $\mathcal{N}$. These trees
constitute the samples for a learner. The goal is to learn an
LPTS $L$ such that $\mathcal{P} \subseteq \mathcal{L}(L)$ and $\mathcal{N} \cap \mathcal{L}(L) = \emptyset$, i.e. $P \preceq L$
for every $P \in \mathcal{P}$ and $N \preceq L$ for no $N \in \mathcal{N}$. Such an $L$
is said to be consistent with the tree samples. Without loss
of generality, assume that $\mathcal{P} \neq \emptyset$ as otherwise, a single state
LPTS with no transitions is trivially consistent. Also, note that
the LPTS obtained by merging the start states of all trees in
$\mathcal{P}$, say $L_\mathcal{P}$, trivially satisfies $P \preceq L_\mathcal{P}$ for every $P \in \mathcal{P}$. Now,
if $L$ is a consistent LPTS, it can be shown that $L_\mathcal{P} \preceq L$ and
hence, by Lemma 3, $L_\mathcal{P}$ is also consistent. Thus, one can
easily check, in polynomial time, if there exists a consistent
LPTS by checking whether $N \preceq L_\mathcal{P}$ for every $N \in \mathcal{N}$: a consistent
LPTS exists iff none of these checks succeeds. For this reason,
we always assume the existence of a consistent LPTS. Clearly,
the size of $L_\mathcal{P}$ is as large as that of $\mathcal{P}$.

If possible, we would like to learn a model with the least 
size, or at least have a good upper bound on its size. Such 
models would be useful when automating assume-guarantee 
reasoning (see Section V). The algorithms we propose draw
inspiration from the ones used to infer consistent non-probabilistic
automata from counterexample traces [24], [16],
[6], [18] which are based on partitioning the state space of the
counterexamples. Let $S_\mathcal{P} = \bigcup_{P \in \mathcal{P}} S_P$ and $S_\mathcal{N} = \bigcup_{N \in \mathcal{N}} S_N$.
First, we consider an algorithm based on the traditional state
space partitioning of $S_\mathcal{P}$. While there is an upper bound on
the size of the learnt model, we show that such partitioning is
insufficient to obtain a minimum state consistent probabilistic
system (LPTS). However, as we will see in Section IV, we find
it useful in learning an unknown target LPTS. We will then
introduce a new way of partitioning the state space, which we
call stochastic partitioning, enabling us to obtain a minimum
state consistent LPTS.
Fig. 5: Positive (P) and negative (N a , Nt, N^'^) tree samples.

A. Using State Partitions 

The first algorithm uses traditional partitions of $S_\mathcal{P}$. For
a partition $\Pi$ of $S_\mathcal{P}$, let $E_\Pi$ denote the set of equivalence
classes under $\Pi$ and for a state $s \in S_\mathcal{P}$, we let $[s]_\Pi$ denote
the equivalence class of $s$ (we drop the subscript $\Pi$ when it is
clear from the context). We always assume that $[s_P^0]_\Pi = [s_Q^0]_\Pi$
for every $P, Q \in \mathcal{P}$, i.e. the start states of all the positive
counterexamples are mapped to the same equivalence class.

Definition 7 (Quotient LPTS). Given a partition $\Pi$ of
$S_\mathcal{P}$, define the quotient LPTS, denoted $\mathcal{P}/\Pi$, as the LPTS
$(E_\Pi, e^0, \alpha, \tau)$ where $e^0 = [s_P^0]_\Pi$ for every $P \in \mathcal{P}$,
$\alpha = \bigcup_{P \in \mathcal{P}} \alpha_P$ and $(e, a, \mu) \in \tau$ iff there exists $(s, a, \mu_p) \in \tau_P$
for some $P \in \mathcal{P}$ with $[s]_\Pi = e$ such that $\mu = lift(\mu_p)$ where
$lift(\mu_p)(e') = \sum_{s' \in e'} \mu_p(s')$ for all $e' \in E_\Pi$.

It can be easily shown that a quotient is always a well-defined
LPTS. In the following, $\Pi$ is a partition of $S_\mathcal{P}$.

Lemma 6. $\mathcal{P}/\Pi$ is consistent with $\mathcal{P}$ for all $\Pi$.

Proof Sketch: One can show that $\{(s, [s]_\Pi) \mid s \in S_P\}$ is
a strong simulation between $P$ and $\mathcal{P}/\Pi$ for every $P \in \mathcal{P}$. ■
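
The quotient construction of Definition 7 can be sketched in a few lines of Python. The sketch assumes that a partition is given as a dictionary `block_of` mapping each state to an identifier of its equivalence class, and that lifted distributions are stored as frozen sets of (class, probability) pairs so that identical quotient transitions collapse; these representation choices, the function names and the example tree are assumptions made for illustration.

```python
from fractions import Fraction

def lift(mu, block_of):
    """lift(mu)(e) is the sum of mu(s) over the states s in class e."""
    lifted = {}
    for s, p in mu.items():
        e = block_of[s]
        lifted[e] = lifted.get(e, Fraction(0)) + p
    return lifted

def quotient_transitions(transitions, block_of):
    """Map each transition (s, a, mu) to ([s], a, lift(mu))."""
    result = set()
    for s, a, mu in transitions:
        lifted = frozenset(lift(mu, block_of).items())
        result.add((block_of[s], a, lifted))
    return result

# A hypothetical positive tree and a partition merging s1 with s2
# and s3 with s4.
half = Fraction(1, 2)
tree = [
    ("s0", "a", {"s1": half, "s2": half}),
    ("s1", "b", {"s3": Fraction(1)}),
    ("s2", "b", {"s4": Fraction(1)}),
]
block_of = {"s0": 0, "s1": 1, "s2": 1, "s3": 2, "s4": 2}
quotient = quotient_transitions(tree, block_of)
```

In this example the two `b`-transitions of the tree become a single transition of the quotient, and the lifted `a`-distribution puts probability 1 on the merged class.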

Definition 8 (Consistent Partition). $\Pi$ is defined to be consistent
iff $\mathcal{P}/\Pi$ is consistent with $\mathcal{N}$, i.e. for every $N \in \mathcal{N}$,
$N \not\preceq \mathcal{P}/\Pi$.

Thus, we reduce the problem of finding a consistent LPTS
to that of finding a consistent partition. As we show below,
we can always find a consistent partition with a bounded size,
where the size of $\Pi$ is $|E_\Pi|$.

Lemma 7. If $L$ is an LPTS of $k$ states consistent with $\mathcal{P}$, then
there is a $\Pi$ of size at most $2^k$ such that $\mathcal{P}/\Pi \preceq L$.

Proof Sketch: Let $P \in \mathcal{P}$. As $P \preceq L$, there is a strong
simulation $R_P \subseteq S_P \times S_L$ with $s_P^0 R_P s_L^0$. As $P$ is a tree, $s_P^0$
is not in the support of any distribution and hence, assume
without loss of generality that $R_P(s_P^0) = \{s_L^0\}$. Let
$R = \bigcup_{P \in \mathcal{P}} R_P$. Now, $R$ induces a partition $\Pi$ of $S_\mathcal{P}$ such that
for $s_1, s_2 \in S_\mathcal{P}$, $[s_1]_\Pi = [s_2]_\Pi$ iff $R(s_1) = R(s_2)$. Note that
$[s_P^0]_\Pi = [s_Q^0]_\Pi$ for $P, Q \in \mathcal{P}$. The size of $\Pi$ is clearly bounded
by $2^k$. Now, we can show that $\{([s_p]_\Pi, s_l) \mid s_p R s_l\}$ is a strong
simulation between $\mathcal{P}/\Pi$ and $L$. ■

Note that, if $L$ and every $P \in \mathcal{P}$ is an LTS, an upper bound
of $k$ on the size can be shown by choosing $R_P$ in the proof to
be a function. The following is now immediate, using Lemmas
3 and 6.

Corollary 1. For every consistent LPTS of $k$ states, there is
a consistent partition of size at most $2^k$.





Fig. 6: Quotients for least size partition ($H_1$) and stochastic partition
($H_\lambda$, $\lambda \in (0, 1)$) of $P$ in Figure 5.



Observation. This shows that if $L$ is a minimum state consistent
LPTS, there exists a consistent partition of $S_\mathcal{P}$ of size at
most exponential in $|S_L|$. While there may be a better bound,
this way of partitioning $S_\mathcal{P}$ cannot guarantee a minimum state
consistent LPTS in general. For example, $H_1$ in Figure 6 is
the quotient for a least sized consistent partition of $P$ for the
trees in Figure 5 (obtained by merging $s_3$ and $s_4$). On the other
hand, $H_\lambda$, where $\lambda$ is any value in $(0, 1)$, is another consistent
LPTS with one less state.

Algorithm. A naive algorithm for finding a least-sized consistent
partition is to enumerate all the partitions of $S_\mathcal{P}$,
with increasing size, and for each of them, check if the
corresponding quotient simulates any tree in $\mathcal{N}$. Alternatively,
we can cast it as an instance of the satisfiability problem
over linear rational arithmetic, as shown below. In general,
this is more efficient than the exhaustive search in the naive
algorithm, and also prepares the ground for an algorithm we
discuss in the next subsection.
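
The naive enumeration mentioned above can be sketched as follows. The generator yields every partition of a list into exactly $k$ non-empty blocks, and the search tries $k = 1, 2, \ldots$ in increasing order. The predicate `is_consistent` is a hypothetical stand-in for the check that the quotient of a partition simulates no negative tree; the sketch also assumes that the start states of all positive trees have already been merged into a single item, so that they always share a block.

```python
def partitions_into(items, k):
    """Yield all partitions of items into exactly k non-empty blocks."""
    def rec(i, blocks):
        if i == len(items):
            if len(blocks) == k:
                yield [list(b) for b in blocks]
            return
        # Put items[i] into each existing block in turn.
        for b in blocks:
            b.append(items[i])
            yield from rec(i + 1, blocks)
            b.pop()
        # Or open a new block, if fewer than k blocks exist so far.
        if len(blocks) < k:
            blocks.append([items[i]])
            yield from rec(i + 1, blocks)
            blocks.pop()
    yield from rec(0, [])

def least_consistent_partition(items, is_consistent):
    """Return the first consistent partition found with the fewest blocks."""
    for k in range(1, len(items) + 1):
        for partition in partitions_into(items, k):
            if is_consistent(partition):
                return partition
    return None

# Toy predicate (an assumption for this example only): a partition is
# rejected whenever "a" and "b" end up in the same block.
def keeps_a_and_b_apart(partition):
    return all(not {"a", "b"} <= set(block) for block in partition)
```

The number of partitions grows very quickly with the number of states, which is one reason to prefer the constraint-based formulation described next.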

First, we describe the encoding to check if there is a
consistent partition of size at most a given $k$. Let $e_i$ denote the
equivalence class $i$ for $1 \le i \le k$. For each $i$ and state $s \in S_\mathcal{P}$,
we introduce a new boolean variable, say $v_{[s]=i}$, to denote
$[s] = e_i$. We add the constraint $xor(v_{[s]=1}, \ldots, v_{[s]=k})$ for
every $s \in S_\mathcal{P}$ for the partition to be well-defined. Moreover,
we fix $e_1$ to be the start state of the resulting quotient and
have a constraint that $v_{[s_P^0]=1}$ for every $P \in \mathcal{P}$ as $e_1$ should
now contain all the start states (Definition 7).

Now, to encode consistency, we want to say that no tree
$N \in \mathcal{N}$ is simulated by the resulting quotient. We can
avoid introducing a universal quantification over all possible
strong simulations by finding a way to say that $(s_N^0, e_1)$ is
not in the coarsest strong simulation, for every $N \in \mathcal{N}$.
Fortunately, we can make use of Lemma 4 to achieve exactly
this. We introduce a boolean variable $R_{s,i}$ to denote that
$s \in S_\mathcal{N}$ is related to $e_i$ by the coarsest strong simulation.
Let $t_n = (s_n, a, \mu_n)$ and $t_p = (s_p, a, \mu_p)$ be a transition of
$\mathcal{N}$ and $\mathcal{P}$, respectively, on the same action $a$, and $1 \le i \le k$.
Consider the expression $d_{\mu_n,\mu_p} \wedge v_{[s_p]=i}$, denoted $\sigma_{t_n,i,t_p}$. If
$d_{\mu_n,\mu_p}$ denotes $\mu_n \sqsubseteq_R lift(\mu_p)$, then this expression has the
meaning that $[s_p] = e_i$ and the transition corresponding to
$t_p$ in the quotient, viz. $e_i \xrightarrow{a} lift(\mu_p)$, simulates $t_n$. If $X(s)$
denotes the set of all transitions outgoing from $s \in S_\mathcal{N}$, $Y(a)$
denotes the set of all transitions in $\mathcal{P}$ on action $a$ and $act(t)$
denotes the action for the transition $t$, we add

$$R_{s,i} \iff \bigwedge_{t_n \in X(s)} \; \bigvee_{t_p \in Y(act(t_n))} \sigma_{t_n,i,t_p}$$

according to Lemma 4.

$lift(\mu_p)(e_i)$ can be encoded as $\sum_{s \in Supp(\mu_p)} m_{\mu_p,i,s}$ where
$m_{\mu_p,i,s}$ denotes the contribution of $s$ to the lifted probability of
$e_i$ under $\mu_p$ and satisfies

$$(v_{[s]=i} \Rightarrow m_{\mu_p,i,s} = \mu_p(s)) \wedge (\neg v_{[s]=i} \Rightarrow m_{\mu_p,i,s} = 0).$$

$d_{\mu_n,\mu_p}$ is encoded as follows. If we use Definition 2 alone, we
need to introduce a nested existential quantifier for the weight
function (to say that $d_{\mu_n,\mu_p}$ iff there is a weight function
satisfying the conditions). To avoid this nested quantification,
we also make use of Lemma 1. First, we introduce a variable
for the weight function and encode the constraints of
Definition 2 if $\sqsubseteq_R$ holds between the distributions. We also
introduce a variable for the witness subset $S \subseteq Supp(\mu_p)$ and
encode the condition of Lemma 1 when $\sqsubseteq_R$ fails to hold.
This variable for the witness subset can, in turn, be encoded
using individual boolean variables for each $s \in Supp(\mu_p)$.
We also need boolean variables for the image of this witness
subset under $R$. The details are straightforward and left to the
reader. Finally, we encode consistency by having the constraint
$\neg R_{s_N^0,1}$ for every $N \in \mathcal{N}$.

It is not hard to show that the encoding is correct, i.e.
the resulting encoding is satisfiable iff there is a consistent
partition of size at most $k$. One can then obtain an algorithm
to find a least-sized consistent partition by starting with
$k = 1$ and incrementing it as long as the encoding for $k$
is unsatisfiable. As satisfiability over linear rational arithmetic
is decidable, this is guaranteed to terminate from Corollary 1.

Theorem 2. The above described algorithm to find a least-sized
consistent partition of $S_\mathcal{P}$ terminates.

B. Using Stochastic Partitions 

As noted above, the quotient of a least-sized consistent
partition need not have the least number of states. We observe
that the main reason for this is not being able to partition
$S_\mathcal{P}$ such that there is a one-to-one correspondence between
the equivalence classes and $S_L$, instead of the current $2^{S_L}$
for a consistent LPTS $L$ (proof of Lemma 7). This suggests
that we can learn a minimum state consistent LPTS if we
can find a way to group the states of $S_\mathcal{P}$ (groups need not
be disjoint) with such a correspondence. This will then imply
that if there is a minimum state consistent LPTS $L$, we can
use this grouping to obtain an equally sized consistent LPTS.
One can then automate the search for such a grouping using
constraint solving.

Let $L$ be a consistent LPTS and let us see what we can
do to group $S_\mathcal{P}$ to have the above one-to-one correspondence
with $S_L$. Consider Figure 3 again and let $\mu_1$ be outgoing from
the root of some tree $P$ in $\mathcal{P}$ and $\mu_2$ appear in $L$. Let there
be three groups (initially empty), one per state in $Supp(\mu_2)$,
say $G_{t_1}$, $G_{t_2}$ and $G_{t_3}$. As explained in Section II, having
$\mu_1 \sqsubseteq_R \mu_2$, for some $R$, can be thought of as finding a way of
splitting the probabilities in both the distributions and pairing
states, already in $R$, to directly match the probabilities. We
would like to use this matching to group the states of $S_\mathcal{P}$. In
particular, looking at the figure, we would like to place the two
splits of $s_1$ ($s_2$) in $G_{t_1}$ and $G_{t_2}$ ($G_{t_2}$ and $G_{t_3}$), respectively.

As the probability of each split of a state in $Supp(\mu_1)$
is matched with that of some split of exactly one state in
$Supp(\mu_2)$, one can also think of the above grouping in the
following alternative way. As the probability of | for si is 
split into i and ~, si can be seen as being put in Gt 1 with 
probability j4| = | and in Gt 2 with probability i4| = |. 
Thus, instead of putting $s_1$ deterministically into one group,
it is put stochastically into multiple groups. Let these splits of
$s_1$ put in $G_{t_1}$ and $G_{t_2}$ be $s_1[t_1]$ and $s_1[t_2]$, respectively.

Now, consider $s_1[t_1]$. As the corresponding probability
is matched with that of some split of $t_1$ (implying $s_1 R t_1$), and
as $s_1$ is not in the support of any distribution other than $\mu_1$
(note that $P$ is a tree), we need not consider if $s_1$ is related,
by $R$, to any other state in $L$, as far as $s_1[t_1]$ is concerned.
And therefore, any distribution outgoing from this split of $s_1$
will only need to be related to some distribution outgoing
from $t_1$ (by $\sqsubseteq_R$). Similarly, for $s_1[t_2]$ and $t_2$. Now, if $\mu_3$
is a distribution outgoing from $s_1$ in $P$, we may want to
relate it to a distribution $\mu$ outgoing from $t_1$ (for $s_1[t_1]$) and
another distribution $\mu'$ outgoing from $t_2$ (for $s_1[t_2]$). For a
state $s_3 \in Supp(\mu_3)$, considering $\mu_3 \sqsubseteq_R \mu$ and $\mu_3 \sqsubseteq_R \mu'$
both hold, following the above described stochastic grouping
may result in two different ways of grouping $s_3$. Thus, we
need to remember the group of its parent, denoted by $par(\cdot)$,
when grouping a state in $S_\mathcal{P}$.

This is the main motivation behind a stochastic partition, 
which is defined below. 

Definition 9 (Stochastic Partition). A stochastic partition of
$S_{\mathcal{P}}$ is a tuple $(G, \{[s]\}_{s \in S_{\mathcal{P}}})$ where $G \subseteq 2^{S_{\mathcal{P}}}$ and
$[s] : G \to Dist(G)$ for every $s \in S_{\mathcal{P}}$, such that $\bigcup G = S_{\mathcal{P}}$ and

1) there is a $g^0 \in G$ such that for every $P \in \mathcal{P}$ and $g \in G$,
$[s^0_P](g) = \delta_{g^0}$ and

2) for every non-root state $s \in S_{\mathcal{P}}$ and $g \in G$, $[s](g)$ is
defined iff $[par(s)](g')(g) > 0$ for some $g' \in G$.

Furthermore, $s \in g$ iff $[s](g')(g) > 0$ for some $g' \in G$, for
every $s \in S_{\mathcal{P}}$ and $g \in G$.

We use $(G_\Pi, \{[s]_\Pi\}_s)$ for a stochastic partition $\Pi$ and when
$\Pi$ is clear, we drop the subscripts.

Here, $G$ denotes the groups mentioned above and $[s]$ denotes
the stochastic grouping of $s \in S_{\mathcal{P}}$ given a group of its parent.
Point 1 above says that the start states of all trees in $\mathcal{P}$ go
deterministically to a designated group. Note that the start
states have no parents and the dependence of $[s^0_P]$ on an
argument is just a notational convenience. And point 2 says
that for every non-root state $s$, $[s]$ is only defined for a valid
group of its parent. We implicitly assume that $[s](g')(g) = 0$
for every $g \in G$ if $[s]$ is not defined at $g'$.

Now, we define the quotient of a stochastic partition in the 
following way. 



Definition 10 (Quotient LPTS). Given a stochastic partition
$\Pi = (G, \{[s]\}_s)$ of $S_{\mathcal{P}}$, define the quotient LPTS, denoted
$\mathcal{P}/\Pi$, as the LPTS $(G, g^0, \alpha, \tau)$ where $g^0 \in G$ is such that
$[s^0_P](g) = \delta_{g^0}$ for every $P \in \mathcal{P}$ and $g \in G$, $\alpha = \bigcup_{P \in \mathcal{P}} \alpha_P$
and $(g, a, \mu) \in \tau$ iff there exists $(s, a, \mu_p) \in \tau_P$, for some
$P \in \mathcal{P}$ such that $s \in g$ and for every $g' \in G$,

$$\mu(g') = \sum_{s' \in g'} [s'](g)(g') \cdot \mu_p(s').$$

We denote this relation between $\mu$ and $\mu_p$ by $\mu = lift(\mu_p, g)$.

Thus, $(g, a, \mu) \in \tau$ iff there is a state $s \in g$ with $s \xrightarrow{a} \mu_p$ and
$\mu$ is obtained by lifting $\mu_p$, given that $s \in g$. For this to make
sense, we need to show that the lifting is a valid distribution.
In the following, $\Pi = (G, \{[s]\}_s)$ is a stochastic partition.

Lemma 8. $\mathcal{P}/\Pi$ is a well-defined LPTS.
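To make the lifting of Definition 10 concrete, the following is a minimal Python sketch that computes lift for a single transition and illustrates the claim of Lemma 8 that the lifted masses sum to 1. The dictionary layout, the function name `lift` and the example data are assumptions made for illustration only; they do not come from the paper.

```python
from fractions import Fraction


def lift(mu_p, g, grouping, groups):
    """Lift a tree distribution mu_p to a distribution over groups.

    mu_p     -- dict: tree state s' -> probability mu_p(s')
    g        -- the group in which the source state s is considered
    grouping -- dict: s' -> {parent group -> {group -> probability}},
                so grouping[s'][g][g2] stands for [s'](g)(g2)
    groups   -- iterable of all group names (the set G)
    """
    lifted = {}
    for g2 in groups:
        # [s'](g)(g2) is read as 0 wherever [s'] is undefined at g.
        mass = sum(
            (grouping.get(s2, {}).get(g, {}).get(g2, Fraction(0)) * p
             for s2, p in mu_p.items()),
            Fraction(0),
        )
        if mass > 0:
            lifted[g2] = mass
    return lifted


# Hypothetical data: s1 goes to g1 surely, s2 is split between g1 and g2.
mu_p = {"s1": Fraction(1, 2), "s2": Fraction(1, 2)}
grouping = {
    "s1": {"g0": {"g1": Fraction(1)}},
    "s2": {"g0": {"g1": Fraction(1, 3), "g2": Fraction(2, 3)}},
}
mu = lift(mu_p, "g0", grouping, ["g0", "g1", "g2"])
```

Summing over the support of the tree distribution instead of over the members of each group gives the same result here, because a state with a positive grouping probability into a group belongs to that group by Definition 9.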

We have the following lemma analogous to classical parti- 
tions. 

Lemma 9. $\mathcal{P}/\Pi$ is consistent with $\mathcal{P}$ for all $\Pi$.

Proof Sketch: One can show that $\{(s, g) \mid g \in G, s \in S_P \cap g\}$
is a strong simulation between $P$ and $\mathcal{P}/\Pi$ for $P \in \mathcal{P}$. ■

Consistency of a stochastic partition is defined in the same
way as Definition 8. Thus, we reduce the problem of finding a
minimum state consistent LPTS to that of finding a least-sized 
consistent stochastic partition where the size of a stochastic 
partition is its number of groups. 

Lemma 10. If $L$ is an LPTS of $k$ states consistent with $\mathcal{P}$,
then there is a $\Pi$ of size at most $k$ with $\mathcal{P}/\Pi \preceq L$.

Proof Sketch: Let $P \in \mathcal{P}$. As $P \preceq L$, there is
a strong simulation $R_P \subseteq S_P \times S_L$ with $s^0_P R_P s^0_L$. Let
$R = \bigcup_{P \in \mathcal{P}} R_P$. Now, construct a stochastic partition with at
most $|S_L|$ many groups following the intuitive explanation we
gave when motivating stochastic partitions. For distributions
$\mu_p \in Dist(S_P)$ and $\mu_l \in Dist(S_L)$, the stochastic grouping
of a state $s \in Supp(\mu_p)$ is obtained by using a weight function
showing $\mu_p \sqsubseteq_R \mu_l$. In particular, $s$ is put in the group
corresponding to $s_l \in S_L$ with probability $w(s, s_l)/\mu_p(s)$
where $w$ is the weight function which is uniquely chosen given
$\mu_p$ and $\mu_l$. Moreover, $\mu_l$ and this grouping depend on the
group of $par(s)$. Once such a stochastic partition $\Pi$ is built,
we can show that $\{(g, s_l) \mid g$ is the group corresponding to $s_l\}$
is a strong simulation between $\mathcal{P}/\Pi$ and $L$. ■
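The grouping step in the proof sketch can be illustrated with a small Python sketch that turns a weight function into the grouping probabilities w(s, s_l)/mu_p(s). The function name, the dictionary encoding and the numbers below are hypothetical; the example only mirrors the shape of the motivating discussion (one tree state split between two LPTS states), not the values of the paper's figure.

```python
from fractions import Fraction


def grouping_from_weights(mu_p, w):
    """Return {s: {s_l: w(s, s_l) / mu_p(s)}} for every s in Supp(mu_p).

    mu_p -- dict: tree state -> positive probability
    w    -- dict: (tree state, LPTS state) -> weight, a weight function
            witnessing that mu_p is related to some distribution mu_l
    """
    result = {}
    for s, p in mu_p.items():
        row = {t: wt / p for (s2, t), wt in w.items() if s2 == s and wt > 0}
        # A weight function distributes exactly mu_p(s) over the partners of s.
        if sum(row.values()) != 1:
            raise ValueError(f"weights of {s} do not sum to mu_p({s})")
        result[s] = row
    return result


# Hypothetical weight function between mu_p and mu_l = {t1: 1/4, t2: 1/2, t3: 1/4}.
mu_p = {"s1": Fraction(1, 2), "s2": Fraction(1, 2)}
w = {
    ("s1", "t1"): Fraction(1, 4), ("s1", "t2"): Fraction(1, 4),
    ("s2", "t2"): Fraction(1, 4), ("s2", "t3"): Fraction(1, 4),
}
groups_of = grouping_from_weights(mu_p, w)
```

In this example each tree state is placed in each of its two groups with probability 1/2, so no state is assigned to a single group deterministically.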

Our main result follows as an immediate corollary, using
Lemmas 3 and 9.

Corollary 2. For every consistent LPTS of $k$ states, there is
a consistent stochastic partition of size at most $k$.

So, we can obtain a minimum state consistent LPTS by
constructing the quotient for a consistent stochastic partition
of $S_{\mathcal{P}}$ of the least size. For example, $H_\lambda$, $\lambda \in (0, 1)$, in Figure 6
is the quotient for a least sized consistent stochastic partition
for the trees in Figure 5 (where $s_1$ goes to group 1, $s_2$ goes
to group 2 with probability $\lambda$ and to group 1 with $1 - \lambda$ and
$s_3$ and $s_4$ go to group 2). We describe an algorithm to find
a least-sized consistent stochastic partition by casting it as
an instance of the satisfiability problem over linear rational
arithmetic.

Algorithm. The encoding is similar to the case of partitions
in the previous subsection. To find a stochastic partition of
size at most a given $k$, let $g_i$ denote the group $i$ for $1 \le i \le k$.
Introduce a non-negative rational variable $v_{[s](i),j}$ to
denote $[s](g_i)(g_j)$ for every $s \in S_{\mathcal{P}}$, $1 \le i, j \le k$. For every
$i$ and $s \in S_{\mathcal{P}}$, add the constraint

$$\Big(\sum_{1 \le j \le k} v_{[s](i),j} = 1\Big) \vee \Big(\sum_{1 \le j \le k} v_{[s](i),j} = 0\Big)$$

to denote that $[s](g_i)$ is a distribution or is undefined. Then, we
encode points 1 and 2 of Definition 9 by adding the constraint
$v_{[s^0_P](i),1} = 1$ for every $i$ and $P \in \mathcal{P}$, making $g_1$ the start
state of the quotient, and adding

$$\sum_{1 \le j \le k} v_{[s](i),j} = 1 \iff \sum_{1 \le l \le k} v_{[par(s)](l),i} > 0$$

for every non-root state $s$ and $i$. This ensures that the stochastic
partition obtained is well-defined.
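As a sanity check on the well-definedness constraints above, the following Python sketch verifies them for a given candidate assignment of the variables. It is only a checker, not a solver: finding an assignment would be delegated to a decision procedure for linear rational arithmetic, which is not shown. The function name, the keying of variables by `(s, i, j)` and the use of `None` to mark root states are assumptions of this sketch.

```python
from fractions import Fraction


def check_encoding(v, k, parent):
    """Check the well-definedness constraints for a candidate assignment.

    v      -- dict: (state, i, j) -> value of the variable for [s](g_i)(g_j);
              missing entries are read as 0
    k      -- number of groups, indexed 1..k, with group 1 the start group
    parent -- dict: state -> parent state, or None for the root of a tree
    """
    idx = range(1, k + 1)

    def val(s, i, j):
        return v.get((s, i, j), Fraction(0))

    for s, par in parent.items():
        for i in idx:
            row = [val(s, i, j) for j in idx]
            if any(x < 0 for x in row):
                return False          # variables are non-negative
            total = sum(row)
            if total not in (0, 1):
                return False          # a distribution, or undefined
            if par is None:
                if val(s, i, 1) != 1:
                    return False      # roots go to group 1 surely
            else:
                parent_in_i = sum(val(par, l, i) for l in idx) > 0
                if (total == 1) != parent_in_i:
                    return False      # defined iff the parent can be in g_i
    return True
```

For a root `r` with one child `s` and `k = 2`, an assignment that defines the grouping of `s` only for group 1 (the only group `r` can be in) passes the check, while additionally defining it for group 2 is rejected.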

Encoding consistency is the same as before except for
$v_{t_n,i,t_p}$ ($t_n$, $i$ and $t_p$ are as before) which will now be

$$\Big(\sum_{1 \le j \le k} v_{[s_p](j),i} > 0\Big) \wedge d_{\mu_n,\mu_p,i}$$

where $d_{\mu_n,\mu_p,i}$ denotes $\mu_n \sqsubseteq_R lift(\mu_p, g_i)$. Thus, we will
check if there is a group of $par(s_p)$ (summation over $1 \le j \le k$)
for which $s_p \in g_i$ and $\mu_n \sqsubseteq_R lift(\mu_p, g_i)$. For a $j$,
$lift(\mu_p, g_i)(g_j)$ is encoded as $\sum_{s \in Supp(\mu_p)} v_{[s](i),j} \cdot \mu_p(s)$. The rest
of the encoding is similar.

We can similarly show the correctness of the encoding, and
the termination of the algorithm follows from Corollary 2.

Theorem 3. The problem of learning a minimum state consistent
LPTS with $\mathcal{P}$ and $\mathcal{N}$ is decidable.

IV. Active Learning for LPTSes 

We now consider the problem of learning the language of
an LPTS, i.e. learning an LPTS up to simulation equivalence
(following Lemma 5), in the framework of active learning. Let
$U$ be an unknown target LPTS. The learning framework has
a learner and a teacher. The goal of the learner is to learn an
LPTS $L$ such that $L \sim U$. To that effect, the learner maintains
a hypothesis LPTS $H$. The process of learning proceeds in
rounds where in each round, the learner makes a query to the
teacher and updates $H$ based on the response. For reasons
mentioned in the introduction, we only consider a single type
of query in this paper where the learner conjectures $H$ as
(simulation) equivalent to $U$. In response to such a query,
the teacher is expected to check whether $H \sim U$ holds and
otherwise, return a counterexample. If it is a counterexample
to $H \preceq U$ ($U \preceq H$), it is called a negative (positive)
counterexample. Following Section II, we assume that the
counterexamples are always trees. Furthermore, there should
always exist an LPTS consistent with all of the counterexamples,
i.e. simulating all the positive counterexamples and
none of the negative counterexamples, received by the learner
so far. Also, every conjecture $H$ made by the learner should
be consistent with the counterexamples received so far, in the
above sense.

Unfortunately, the framework, as described above, is too
general to be useful, as the following theorem shows.

Theorem 4. The problem of learning an unknown LPTS $U$ is
undecidable in the active learning framework.

Proof Sketch: We show that there is no algorithm to
learn the unknown target $U_\lambda$, which first performs an action
$a$ and goes to a state with (unknown) probability $\lambda$ to loop
on action $b$ or goes to another state with the remaining probability
to deadlock, by describing an adversarial teacher which
manipulates the value of $\lambda$ as necessary to keep generating
counterexamples. After choosing an initial value of $\lambda$, the
teacher returns a counterexample as long as the hypothesis
is not simulation equivalent to the target. If a hypothesis
simulation equivalent to the target is conjectured, the teacher
increases the value of $\lambda$ just enough to have the new target
not simulated by the hypothesis, while still being consistent
with all the previously generated counterexamples, and a new
(positive) counterexample can then be generated. ■

The main reason behind the theorem is that it is not
necessary for the positive tree counterexamples returned by
the teacher to have an execution mapping to $U$ (see Section
II). Such a teacher can be seen as an adversary which can
choose the probability values in the counterexamples returned,
which are infinitely many, to make the learner never converge
to the desired probabilities.

But, in practice, to be able to apply the learning framework
in a given setting, one needs to implement the teacher's
algorithm and we are not aware of any algorithm to generate
counterexamples other than the one discussed in Section II. As
mentioned before, this algorithm has the interesting property
that the generated counterexamples have an execution mapping
to $L_1$ when $L_1 \preceq L_2$ fails. This suggests imposing the
following friendliness condition on a teacher.

Condition 1 (Friendly Teacher). Every positive (negative) 
counterexample returned by the teacher should have an ex- 
ecution mapping to U (H). 

First of all, we observe that the proof of Theorem 4 no
longer works because an update to $\lambda$ may violate Condition
1 on any positive counterexample already returned. In fact, as
we show below, the problem becomes decidable. Let $\mathcal{P}$ and
$\mathcal{N}$ denote the sets of positive and negative counterexamples,
returned by the teacher so far, respectively. First, consider the
pseudo-code in Algorithm 1. It suggests a method of using the
algorithms described in Section III by treating $\mathcal{P}$ and $\mathcal{N}$ as
the tree samples. There is a choice at line 6 to use partitions
or stochastic partitions.
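The control flow of the active learning loop can be sketched in Python as follows. The teacher and the sample-learning step are passed in as callables, since their implementations (simulation checks and the partition-based algorithms of Section III) are outside the scope of this sketch. All names, the `("positive" | "negative", tree)` answer format and the `max_rounds` cut-off are assumptions for illustration; the cut-off is not part of the loop in the paper.

```python
def active_learning_loop(teacher, learn_from_samples, initial_hypothesis,
                         max_rounds=None):
    """Run equivalence queries until the teacher accepts the hypothesis.

    teacher(h)               -- returns None if h is accepted, otherwise a pair
                                (kind, tree) with kind "positive" or "negative"
    learn_from_samples(P, N) -- returns a hypothesis consistent with the
                                positive samples P and negative samples N
    """
    positives, negatives = [], []
    hypothesis = initial_hypothesis
    rounds = 0
    while max_rounds is None or rounds < max_rounds:
        answer = teacher(hypothesis)          # conjecture H to the teacher
        if answer is None:
            return hypothesis                 # exit: H accepted
        kind, tree = answer
        (positives if kind == "positive" else negatives).append(tree)
        hypothesis = learn_from_samples(positives, negatives)
        rounds += 1
    return None                               # gave up after max_rounds
```

Whether `learn_from_samples` builds the quotient of a least-sized partition or of a least-sized stochastic partition is exactly the choice left open at line 6.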

First, we show that using traditional partitions at line 6 
makes the problem of learning a target decidable. 

Algorithm 1 Active Learning Loop.

1: $\mathcal{P} = \mathcal{N} = \emptyset$
2: $H \leftarrow$ single state LPTS with no transitions
3: repeat
4:    conjecture $H$ to the teacher
5:    update $\mathcal{P}$ and $\mathcal{N}$ from returned counterexamples, or exit
6:    obtain a least sized consistent (stochastic) partition $\Pi$
7:    $H \leftarrow \mathcal{P}/\Pi$
8: until false

Lemma 11. The active learning loop of Algorithm 1 terminates
under Condition 1 on the teacher and using partitions
at line 6 with the number of states of each intermediate
hypothesis $H$ bounded by that of $U$.

Proof Sketch: Consider an arbitrary iteration of the
learning loop. First of all, due to Condition 1, the quotient
of the partition induced by the execution mappings from the
positive counterexamples to $U$ is a sub-structure of $U$ and
hence, is trivially simulated by $U$ and is a consistent LPTS.
As the algorithm finds a least-sized consistent partition, its
size is bounded by $|S_U|$.

Then, notice that every future hypothesis is consistent with
any new counterexample returned, and hence, is distinct from
the current one. Moreover, due again to Condition 1 and as
lift only adds probabilities, one can show that there are only
finitely many possible distributions for a given partition size.

We conclude that the algorithm terminates. ■

Thus, we have the following result.

Theorem 5. The problem of learning an unknown LPTS is
decidable in the active learning framework, with Condition 1
on the teacher.

It is sometimes desirable to learn an LPTS with the least
number of states. While the algorithm described above learns
an LPTS, it is not guaranteed to output a minimum state LPTS
simply because each hypothesis need not have the least number
of states (see Section III-A). This suggests imposing the
following condition on the learner.

Condition 2 (Learner). Every hypothesis $H$ made by the
learner is a minimum state LPTS consistent with $\mathcal{P}$ and $\mathcal{N}$.

If there is a learning algorithm under Conditions 1 and 2,
then it is guaranteed to output a minimum state LPTS which is
(simulation) equivalent to $U$. But, there is no such algorithm
as we show below.

Theorem 6. The problem of learning an unknown LPTS $U$
is undecidable in the active learning framework, with both
Condition 1 on the teacher and Condition 2 on the learner.

Proof Sketch: We show that there is no algorithm to
learn (unknown) $H_\lambda$ in Figure 6 by describing an adversarial
teacher which can return a counterexample for any conjectured
hypothesis. Initially, the teacher keeps returning negative
counterexamples, if there are transitions on actions other than $a$, $b$
and $c$ in the hypothesis, or the positive counterexample $P$ in
Figure 5, until the learner conjectures a single-state LPTS with
self-loops on these three actions. Thereafter, if a conjectured
hypothesis has transitions on only $a$, $b$ and $c$ and simulates
$P$, the teacher returns $N_a$ to force the future hypotheses to
have at least two states and in every future round, returns $N_b^\beta$
or $N_c^\gamma$ in the figure, as necessary. One can show that there
are always suitable values of $\beta$ and $\gamma$ whenever $N_b^\beta$ or $N_c^\gamma$ needs
to be returned and the learner always conjectures a two state
LPTS. In fact, $H_\lambda$ is always a consistent LPTS for a suitable
$\lambda \in (0, 1)$. ■

However, we obtain a semi-algorithm to the problem by
using stochastic partitions at line 6 of Algorithm 1. That is,
if the algorithm terminates, it is guaranteed to learn the target
with the least number of states. Correctness is immediate from
Theorem 3.

V. Learning Assumptions for 
Compositional Reasoning 

As mentioned in the introduction, the original motivation for 
this work was to automate assume-guarantee style reasoning 
for simulation conformance. Assume-guarantee reasoning [25]
is a compositional technique that breaks up the verification of 
large systems into that of its components for increased scala- 
bility. When checking individual components, the method uses 
assumptions about their environments and discharges them 
on the rest of the system. For a system of two components, 
such reasoning is captured by the following simple assume- 
guarantee rule (ASym). 

$$\frac{L_1 \parallel A \preceq P \qquad L_2 \preceq A}{L_1 \parallel L_2 \preceq P}$$

Several other assume-guarantee rules have been proposed, 
some of them involving symmetric 1261 or circular reason- 
ing fH, E51, (20). Despite its simplicity, rule ASym has 
been proven most effective in practice and has been studied
extensively, mainly in a non-probabilistic setting, for different
notions of conformance [26], [9], [10].

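In our case, $L_1$, $L_2$, $A$ and $P$ are LPTSes with $P$ standing
for the specification which the composition $L_1 \parallel L_2$ should
conform to, where $\parallel$ is defined below.

Definition 11 (Composition [28]). The parallel composition
of $L_1$ and $L_2$, denoted $L_1 \parallel L_2$, is defined as the LPTS
$(S_1 \times S_2, (s^0_1, s^0_2), \alpha_1 \cup \alpha_2, \tau)$ where $(s_1, s_2) \xrightarrow{a} \mu$ iff

1) $s_1 \xrightarrow{a} \mu_1$, $s_2 \xrightarrow{a} \mu_2$ and $\mu = \mu_1 \otimes \mu_2$, or

2) $s_1 \xrightarrow{a} \mu_1$, $a \notin \alpha_2$ and $\mu = \mu_1 \otimes \delta_{s_2}$, or

3) $a \notin \alpha_1$, $s_2 \xrightarrow{a} \mu_2$ and $\mu = \delta_{s_1} \otimes \mu_2$.

Here $\nu_1 \otimes \nu_2 \in Dist(S_1 \times S_2)$, such that $\nu_1 \otimes \nu_2 : (s_1, s_2) \mapsto
\nu_1(s_1) \cdot \nu_2(s_2)$, for $\nu_1 \in Dist(S_1)$, $\nu_2 \in Dist(S_2)$.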
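The product of distributions and the three cases of Definition 11 can be sketched in Python as below. Distributions are encoded as dictionaries from states to probabilities, and each component's transitions as a dictionary from `(state, action)` to a list of distributions; this encoding and all function names are assumptions of the sketch rather than anything prescribed by the definition.

```python
from fractions import Fraction
from itertools import product


def dirac(s):
    """The Dirac distribution on s."""
    return {s: Fraction(1)}


def tensor(nu1, nu2):
    """Product distribution: (s1, s2) -> nu1(s1) * nu2(s2)."""
    return {(s1, s2): p1 * p2
            for (s1, p1), (s2, p2) in product(nu1.items(), nu2.items())}


def composed_transitions(s1, s2, a, trans1, trans2, alpha1, alpha2):
    """Distributions reachable from (s1, s2) on action a in the composition."""
    out1 = trans1.get((s1, a), [])
    out2 = trans2.get((s2, a), [])
    if a in alpha1 and a in alpha2:      # case 1: both components move
        return [tensor(m1, m2) for m1 in out1 for m2 in out2]
    if a in alpha1:                      # case 2: only the first moves
        return [tensor(m1, dirac(s2)) for m1 in out1]
    if a in alpha2:                      # case 3: only the second moves
        return [tensor(dirac(s1), m2) for m2 in out2]
    return []
```

On a shared action the composition has a transition only if both components do, which is why an empty list on either side yields no composed transition in the first case.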

The main challenge in using assume-guarantee reasoning is
to automatically come up with a small assumption $A$ satisfying
the premises. We first note that the proposed rule is sound and
complete [19]. Completeness, obtained trivially by replacing $A$
with $L_2$, is essential to guarantee termination of our proposed
algorithm. Previous attempts at automating assume-guarantee
reasoning using learning in a probabilistic setting have been
restricted to checking probabilistic reachability properties using
either an incomplete rule [13] or algorithms which may
not terminate [14].

Motivated by the success of existing applications of active
learning to assume-guarantee reasoning [26], [9], [10], we
propose to use the active learning framework presented in
Section IV to learn an intermediate assumption $A$ in the
rule ASym. We describe an algorithm for the problem using
learning and show termination below.

Teacher. The teacher is implemented by two conformance
checks corresponding to the two premises of the rule, checked
in any order.

• Premise 1 guides the learner towards a conjecture that
makes $L_1 \parallel A \preceq P$ true.

• Premise 2 guides the learner towards a conjecture that is
discharged on $L_2$, i.e. that makes $L_2 \preceq A$ true.

If the conjectured $A$ satisfies both the premises, soundness
of ASym implies $L_1 \parallel L_2 \preceq P$ holds, and the teacher
returns true. If one of the premises fails, the teacher generates
counterexamples with an execution mapping (Section II).
Thus, the teacher satisfies Condition 1. When premise 2 fails,
a positive counterexample is returned to the learner. When
premise 1 fails, the obtained counterexample is first projected
onto $A$ and then returned as a negative counterexample. As a
counterexample $C$ to premise 1 has an execution mapping to
$L_1 \parallel A$, the projection onto $A$ is simply the contribution of
$A$ towards $C$ in the composition. To enable this, additional
information regarding individual distributions is maintained
during composition [19].
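The teacher's decision logic can be summarized by the following Python sketch. The two premise checks and the projection are supplied as callables because their actual implementations (simulation checking with counterexample generation, and bookkeeping during composition) are not reproduced here; their names and the convention of returning `None` when a check succeeds are assumptions of this sketch. Premise 1 is checked first, which is one of the permitted orders.

```python
def assume_guarantee_teacher(assumption, premise1_counterexample,
                             premise2_counterexample, project_onto_assumption):
    """Answer an equivalence query for a conjectured assumption A.

    premise1_counterexample(A) -- None if L1 || A conforms to P, else a tree
    premise2_counterexample(A) -- None if L2 conforms to A, else a tree
    project_onto_assumption(C) -- contribution of A to a counterexample C
    Returns None if both premises hold, otherwise (kind, tree).
    """
    cex1 = premise1_counterexample(assumption)
    if cex1 is not None:
        # Premise 1 failed: A is too permissive, return a negative sample.
        return ("negative", project_onto_assumption(cex1))
    cex2 = premise2_counterexample(assumption)
    if cex2 is not None:
        # Premise 2 failed: A does not cover L2, return a positive sample.
        return ("positive", cex2)
    return None
```

The spuriousness check on negative counterexamples is deliberately left out of this function, since it is performed by the learner.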

Spuriousness Check. Note that if $L_1 \parallel L_2 \not\preceq P$, no
assumption satisfies both the premises of ASym (violating
the assumption on the existence of a consistent LPTS in
Section III). To detect this, the learner needs to check if a
counterexample returned by the teacher exposes the failure of
the conclusion of ASym. A real counterexample would imply
that the specification will not hold of the original system while
a spurious one would need the learner to revise its hypothesis
for the assumption. We restrict the spuriousness check to negative
counterexamples following previous approaches [26]. A simple
way is to check $N \preceq L_2$ for a negative counterexample $N$. $N$
is real if the check succeeds and spurious, otherwise. A slightly
more involved, but practical, way is described elsewhere [19].

Algorithm. Now, the learner can simply use Algorithm 1,
using partitions, to learn an intermediate assumption. As the
positive (negative) counterexamples have execution mapping
to $L_2$ ($A$), it is as if the unknown target is $L_2$. Note that if
$P$ holds of the system, $L_2$ is clearly an assumption satisfying
the premises. However, the algorithm is expected to terminate
with a smaller assumption in practice, which also satisfies the
premises. If $P$ does not hold, the algorithm terminates with
a real counterexample. Termination is guaranteed by Lemma
11. If we also impose Condition 2, the learner uses stochastic
partitions in Algorithm 1, giving a semi-algorithm.

Complexity Analysis. Let us now analyze the complexity
of assume-guarantee reasoning using the learning algorithm
described above (with partitions). The complexity of checking
$L_1 \parallel L_2 \preceq P$ directly is $O(poly(|L_1| \cdot |L_2|, |P|))$, where $|L|$
denotes $\max(|S_L|, |\tau_L|)$.

Let $d = |\tau_2|$ and $b$ be the maximum size of the support of
a distribution in $L_2$. Given a state of a candidate assumption
of size $k$ and a distribution of $L_2$, there can be at most $k^b$-many
corresponding distributions (due to non-determinism)
from that state. For $k$ states and $d$ distributions, this gives
a total of $dk^{b+1}$. Therefore, there are $2^{dk^{b+1}}$ different possible
candidates of size $k$ to consider. The total number of iterations
of the learning algorithm is then bounded by
$\sum_{k=1}^{m} 2^{dk^{b+1}} = O(m 2^{dm^{b+1}})$, where $m$ is the number of states in the final
assumption output by the algorithm.
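To get a feel for how quickly this bound grows, the sum of candidate counts can be evaluated directly. The sketch below assumes the per-size count of 2 to the power d times k to the power (b+1) stated above; the function name is hypothetical.

```python
def iteration_bound(d, b, m):
    """Sum over k = 1..m of the number of size-k candidates, 2**(d * k**(b+1))."""
    return sum(2 ** (d * k ** (b + 1)) for k in range(1, m + 1))
```

Every summand is at most the last one, so the sum is bounded by m times the count for size m, which is the closed form used in the analysis.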

At each iteration, in the worst case, the algorithm enumerates
all the candidate assumptions of the current size $k$ and
performs simulation checks with all the negative counterexamples.
These checks have a complexity of $O(poly(|A|, |\mathcal{N}|, l))$,
where $A$ is the final assumption, $\mathcal{N}$ is the final set of negative
counterexamples and $l$ is the largest $|N|$, for any $N \in \mathcal{N}$.
Thus, the total worst-case complexity of the learning algorithm
for computing the final assumption is $O(poly(|A|, |\mathcal{N}|, l) \cdot
m 2^{dm^{b+1}})$. Furthermore, the complexity of checking the two
premises of ASym is $O(poly(|L_1| \cdot |A|, |P|) + poly(|L_2|, |P|))$
at every iteration. We observe that in practice, if the assumption
is small (i.e. $|A| \ll |L_2|$), this approach can be better than
checking $L_1 \parallel L_2 \preceq P$ directly. In other cases, however, we would
need better algorithms to address the problem. We leave this
for future work.

VI. Conclusion 

We have presented algorithms and decidability results 
for the problem of learning non-deterministic LPTSes from 
stochastic tree samples, using traditional and stochastic state- 
space partitioning. We have also described the application of 
the algorithms to automating the discovery of assumptions for 
the compositional verification of LPTSes. 

In the future, we would like to investigate further conditions 
on the teacher that will make the active learning problem with 
stochastic partitions decidable. We also plan to investigate 
the use of weak simulation for the conformance relation, 
as this will result in smaller assumptions for compositional 
verification. However, algorithms for checking weak simula- 
tion are not currently known. Finally we plan to investigate 
new applications for our algorithms in learning abstractions or 
active model checking and in domains other than verification. 
Appendix

A. Proof of Lemma 2

By Definition 3, $\preceq$ is the union of all strong simulations. It
can easily be shown that the union of two strong simulations is
a strong simulation and hence $\preceq$ is a strong simulation. It is
also the coarsest as it includes any strong simulation. ■



B. Proof of Lemma 4

It suffices to show that $R \subseteq \preceq$ and $\preceq \subseteq R$.

$R$ is clearly a strong simulation which, by Lemma 2, implies
$R \subseteq \preceq$.

To prove the other direction, let $s_1 \preceq s_2$. We show that
$s_1 R s_2$ by induction on the height of $s_1$ in the tree $L_1$, where
the height of a leaf state is defined to be 0 and the height of
any other state is defined to be one plus the maximum height
of any state in the support of any outgoing distribution from
that state.

For the base case, let $s_1$ be any leaf state. As $s_1$ has no
outgoing transitions, $s_1 R s_2$ trivially holds by the assumption
on $R$.

For the inductive case, let the height of $s_1$ be non-zero and
let $s_1 \xrightarrow{a} \mu_1$. Then, as $\preceq$ is a strong simulation (Lemma 2),
there exists $\mu_2$ with $s_2 \xrightarrow{a} \mu_2$ such that $\mu_1 \sqsubseteq_{\preceq} \mu_2$. Let $S \subseteq
Supp(\mu_1)$. We then have $\mu_1(S) \le \mu_2(\preceq(S))$. As every state
in $Supp(\mu_1)$, and hence in $S$, has a smaller height than that
of $s_1$, by induction hypothesis, $\preceq(S) \subseteq R(S)$ and therefore,
$\mu_1(S) \le \mu_2(R(S))$. As $S$ is arbitrary, we conclude that $\mu_1 \sqsubseteq_R
\mu_2$. By the assumption on $R$, we conclude that $s_1 R s_2$.

Thus, by induction, we conclude that $\preceq \subseteq R$. ■



C. Proof of Lemma 7

Let $P \in \mathcal{P}$. As $P \preceq L$, there is a strong simulation $R_P \subseteq
S_P \times S_L$ with $s^0_P R_P s^0_L$. As $P$ is a tree, $s^0_P$ is not in the
support of any distribution and hence, assume without loss of
generality that $R_P(s^0_P) = \{s^0_L\}$. Let $R = \bigcup_{P \in \mathcal{P}} R_P$. Now,
$R$ induces a partition $\Pi$ of $S_{\mathcal{P}}$ such that for $s_1, s_2 \in S_{\mathcal{P}}$,
$[s_1]_\Pi = [s_2]_\Pi$ iff $R(s_1) = R(s_2)$. Note that $[s^0_P]_\Pi = [s^0_Q]_\Pi$
for $P, Q \in \mathcal{P}$, satisfying the assumption on $\Pi$ in Definition 7.
The size of $\Pi$ is clearly bounded by $2^k$.
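The partition induced by the relation can be computed by grouping tree states according to their image under R, as in the Python sketch below. Encoding R as a set of pairs and the function name are assumptions for illustration.

```python
from collections import defaultdict


def induced_partition(states, R):
    """Group states s by their image R(s) = {t | (s, t) in R}."""
    image = defaultdict(set)
    for s, t in R:
        image[s].add(t)
    blocks = defaultdict(set)
    for s in states:
        # Two states share a block iff they are related to the same set.
        blocks[frozenset(image[s])].add(s)
    return list(blocks.values())
```

Each block corresponds to a distinct subset of the k states of L, which is where the bound of 2 to the power k on the size of the partition comes from.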

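We first show that the relation $R' = \{([s_p]_\Pi, s_l) \mid s_p R s_l\}$ is
a strong simulation. Let $e R' s_l$ and $e \xrightarrow{a} \mu$. By Definition 7,
there exists $s_p \in S_{\mathcal{P}}$ and $\mu_p \in Dist(S_{\mathcal{P}})$ with $[s_p]_\Pi = e$,
$s_p \xrightarrow{a} \mu_p$ and $\mu(e') = \sum_{s' \in e'} \mu_p(s')$ for all $e' \in \Pi$. By the
definition of $R'$ and $\Pi$, $R(s_1) = R(s_2)$ for all $s_1, s_2 \in e$ and
hence, $s_p R s_l$. As $R$ is the disjoint union of strong simulations,
there exists $\mu_l \in Dist(S_L)$ such that $s_l \xrightarrow{a} \mu_l$ and $\mu_p \sqsubseteq_R \mu_l$.
Let $E' \subseteq Supp(\mu)$. Now,

$$\begin{aligned}
\mu(E') &= \sum_{e' \in E'} \mu(e') \\
&= \sum_{e' \in E'} \mu_p(\{s \in S_{\mathcal{P}} \mid [s]_\Pi = e'\}) && \{\text{choice of } \mu\} \\
&= \mu_p(\{s \in S_{\mathcal{P}} \mid [s]_\Pi \in E'\}) \\
&\le \mu_l(R(\{s \in S_{\mathcal{P}} \mid [s]_\Pi \in E'\})) && \{\mu_p \sqsubseteq_R \mu_l\} \\
&= \mu_l\Big(\bigcup_{e' \in E'} R(\{s \in S_{\mathcal{P}} \mid [s]_\Pi = e'\})\Big) \\
&= \mu_l\Big(\bigcup_{e' \in E'} R'(e')\Big) && \{\text{Def. of } R'\}
\end{aligned}$$

So, by Lemma 1, $\mu \sqsubseteq_{R'} \mu_l$. We conclude that $R'$ is a strong
simulation. For an arbitrary $P \in \mathcal{P}$, as $s^0_P R s^0_L$ and as $s^0_{\mathcal{P}/\Pi} =
[s^0_P]_\Pi$ (Definition 7), $s^0_{\mathcal{P}/\Pi} R' s^0_L$. Therefore, $\mathcal{P}/\Pi \preceq L$. ■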

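D. Proof of Lemma 8

Let $(g, a, \mu) \in \tau_{\mathcal{P}/\Pi}$ be arbitrary. It suffices to show that
$\mu \in Dist(G)$. This immediately implies that $\mathcal{P}/\Pi$ is an LPTS,
according to Definition 1. Let $(s, a, \mu_p) \in \tau_P$ for some $P \in \mathcal{P}$
such that $s \in g$ and $\mu = lift(\mu_p, g)$ as in Definition 10. Now,

$$\begin{aligned}
\sum_{g' \in G} \mu(g') &= \sum_{g' \in G} \sum_{s' \in g'} [s'](g)(g') \cdot \mu_p(s') \\
&= \sum_{s' \in S_{\mathcal{P}}} \mu_p(s') \cdot \sum_{g' : s' \in g'} [s'](g)(g') \\
&= \sum_{s' \in S_{\mathcal{P}}} \Big(\mu_p(s') \cdot \sum_{g' : [s'](g)(g') > 0} [s'](g)(g')\Big) && \{\text{Definition 9}\} \\
&= \sum_{s' \in S_{\mathcal{P}}} \mu_p(s') && \{[s'](g) \in Dist(G)\} \\
&= 1 && \{\mu_p \in Dist(S_P)\}.
\end{aligned}$$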



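E. Proof of Lemma 9

Let $P \in \mathcal{P}$. We first show that the relation $R = \{(s, g) \mid g \in
G, s \in S_P \cap g\}$ is a strong simulation.

Let $s R g$ and $s \xrightarrow{a} \mu_p$. As $s \in g$, by Definition 10, $g \xrightarrow{a} \mu$
where for every $g' \in G$,

$$\mu(g') = \sum_{s' \in g'} [s'](g)(g') \cdot \mu_p(s').$$

It suffices to show that $\mu_p \sqsubseteq_R \mu$. Let $S \subseteq Supp(\mu_p)$. Now,

$$\begin{aligned}
\mu_p(S) &= \sum_{s' \in S} \mu_p(s') \\
&= \sum_{s' \in S} \sum_{g' : s' \in g'} [s'](g)(g') \cdot \mu_p(s') && \{[s'](g) \in Dist(G)\} \\
&= \sum_{g' \in G} \sum_{s' \in S \cap g'} [s'](g)(g') \cdot \mu_p(s') \\
&= \sum_{g' \in R(S)} \sum_{s' \in S \cap g'} [s'](g)(g') \cdot \mu_p(s') && \{\text{definition of } R\} \\
&\le \sum_{g' \in R(S)} \sum_{s' \in g'} [s'](g)(g') \cdot \mu_p(s') \\
&= \sum_{g' \in R(S)} \mu(g') && \{\text{choice of } \mu\} \\
&= \mu(R(S))
\end{aligned}$$

So, by Lemma 1, $\mu_p \sqsubseteq_R \mu$. We conclude that $R$ is a strong
simulation. From Definitions 9 and 10, $s^0_P \in s^0_{\mathcal{P}/\Pi}$ and hence,
$s^0_P R s^0_{\mathcal{P}/\Pi}$. Therefore, $P \preceq \mathcal{P}/\Pi$. ■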



F. Proof of Lemma 10

Let $P \in \mathcal{P}$. As $P \preceq L$, there is a strong simulation
$R_P \subseteq S_P \times S_L$ with $s^0_P R_P s^0_L$. Let $R = \bigcup_{P \in \mathcal{P}} R_P$. For $s R s_l$
and $s \xrightarrow{a} \mu_p$, there can be one or more transitions $s_l \xrightarrow{a} \mu_l$
with $\mu_p \sqsubseteq_R \mu_l$. We assume that we can always choose a
unique $s_l \xrightarrow{a} \mu_l$ with $\mu_p \sqsubseteq_R \mu_l$ (say, by ordering the possible
transitions in some way and choosing the first) and also that
we can always choose a unique weight function $w$ satisfying
the conditions of Definition 2 for $\mu_p \sqsubseteq_R \mu_l$.

Create a group of states of $S_{\mathcal{P}}$ for each $s_l \in S_L$, say $\gamma(s_l)$,
initialized to $\emptyset$ and let $\Gamma$ be the set of all these groups. We
will populate these groups by induction on the depth of a
state in $S_{\mathcal{P}}$ with $s \in \gamma(s_l)$ implying $s R s_l$. We will also define
$\varphi(s) : \Gamma \to Dist(\Gamma)$ for each $s \in S_{\mathcal{P}}$ by the same induction.
Let $s \in S_{\mathcal{P}}$ be arbitrary. We proceed by induction on $d(s)$,
the depth of $s$.

The base case is when $d(s) = 0$ implying $s$ is a start state.
$s$ is added to $\gamma(s^0_L)$ and $\varphi(s)$ maps every $g \in \Gamma$ to $\delta_{\gamma(s^0_L)}$.
Clearly, $s R s^0_L$ and $\varphi(s)(g)(\gamma(s^0_L)) > 0$ for every $g \in \Gamma$.

For the inductive step, $d(s) > 0$ and let $g \in \Gamma$. If $par(s) \notin g$,
$\varphi(s)(g)$ is undefined. Otherwise, let $s_l \in S_L$ be the unique
state satisfying $g = \gamma(s_l)$. Thus, $par(s) \in \gamma(s_l)$ and by
induction hypothesis, $par(s) R s_l$. Let $par(s) \xrightarrow{a} \mu_p$ be the
unique transition with $s \in Supp(\mu_p)$ (as $par(s)$ is unique).
As $R$ is the disjoint union of strong simulations, choose
$s_l \xrightarrow{a} \mu_l$ with $\mu_p \sqsubseteq_R \mu_l$ as mentioned in the beginning in
a unique way. Furthermore, let $w$ be the uniquely chosen
weight function satisfying the conditions in Definition 2 for
$\mu_p \sqsubseteq_R \mu_l$. For every $s'_l \in S_L$ with $w(s, s'_l) > 0$, define
$\varphi(s)(g)(\gamma(s'_l)) = w(s, s'_l)/\mu_p(s)$ and add $s$ to $\gamma(s'_l)$. Now,
$w(s, s'_l) > 0$ implies $s R s'_l$ by Definition 2. The definition
also says that $\sum_{s'_l \in S_L} w(s, s'_l) = \mu_p(s)$ which implies that
$\varphi(s)(g) \in Dist(\Gamma)$. Clearly, $\varphi(s)(g)(\gamma(s'_l)) > 0$.

That completes populating $\Gamma$ and defining $\varphi(s)$ for every
state $s \in S_{\mathcal{P}}$. Note that if $\varphi(s)(g)$ is defined, then $par(s) \in
g$ from the above construction and hence, $g$ is non-empty.
Furthermore, every group $g'$ in $Supp(\varphi(s)(g))$ contains $s$, again
from the construction above, and hence, is non-empty.

Now, define a stochastic partition $\Pi = (G, \{[s]\}_{s \in S_{\mathcal{P}}})$ with
$G$ containing all the non-empty groups of $\Gamma$ and $[s]$ given by
$\varphi(s)$. It is not difficult to see that $\Pi$ is well-defined according
to Definition 9. First of all, one can easily show, using the
same induction above, that every state is added to some group
and hence $\bigcup G = S_{\mathcal{P}}$. Then, as discussed above, $\varphi(s)$ is only
defined for groups in $G$ and the support of any distribution
in the range set of $\varphi(s)$ is contained in $G$ and hence, $\varphi(s) :
G \to Dist(G)$. $\gamma(s^0_L)$ is the $g^0$ in Definition 9. Also, from the
way we populated groups in $G$, the condition that $s \in g$ iff
there exists $g' \in G$ such that $[s](g')(g) > 0$ follows for every
$s \in S_{\mathcal{P}}$ and $g \in G$.

We will now show that $\mathcal{P}/\Pi \preceq L$ by first proving that
$R' = \{(g, s_l) \mid g \in G, g = \gamma(s_l)\}$ is a strong simulation. Let
$g R' s_l$ and $g \xrightarrow{a} \mu$. By Definition 10, there exists $s \xrightarrow{a} \mu_p$ in
some $P \in \mathcal{P}$ with $s \in g$ such that for every $g' \in G$,

$$\mu(g') = \sum_{s' \in g'} [s'](g)(g') \cdot \mu_p(s').$$

By definition of $R'$, $g = \gamma(s_l)$ and hence, $s \in \gamma(s_l)$. From
the above construction of $\Pi$, we can then infer $s R s_l$. Now,
choose $s_l \xrightarrow{a} \mu_l$ with $\mu_p \sqsubseteq_R \mu_l$ as mentioned in the beginning
in a unique way. It suffices to show that $\mu \sqsubseteq_{R'} \mu_l$. Let $w$ be
the uniquely chosen weight function to show that $\mu_p \sqsubseteq_R \mu_l$.
Let $\gamma(s'_l) \in Supp(\mu)$. Then,

$$\begin{aligned}
\mu(\gamma(s'_l)) &= \sum_{s' \in \gamma(s'_l)} [s'](g)(\gamma(s'_l)) \cdot \mu_p(s') && \{\text{choice of } \mu \text{ above}\} \\
&= \sum_{s' \in Supp(\mu_p) \cap \gamma(s'_l)} [s'](g)(\gamma(s'_l)) \cdot \mu_p(s') \\
&= \sum_{s' \in Supp(\mu_p)} w(s', s'_l) && \{\text{from the above construction of } [s']\} \\
&= \mu_l(s'_l) && \{\text{Definition 2}\}
\end{aligned}$$

So, $\mu(g') = \mu_l(R'(g'))$ for every $g' \in Supp(\mu)$. As $R'$
maps distinct groups in $G$ to distinct states of $S_L$, it follows
that $\mu \sqsubseteq_{R'} \mu_l$ (by exhibiting the trivial weight function).
We conclude that $R'$ is a strong simulation. Clearly, $s^0_{\mathcal{P}/\Pi} =
\gamma(s^0_L) R' s^0_L$. Therefore, $\mathcal{P}/\Pi \preceq L$. Also, $|G| \le |S_L| = k$. ■

G. Proof of Theorem 4

We give an example where it is impossible for the learner
to converge to the unknown target, up to $\sim$, in presence of an
adversarial teacher.

Fig. 7: There is no learner for the target $U_\lambda$ in presence of an unrestricted
teacher.

Consider $U_\lambda$ in Figure 7 where $\lambda \in (0, 1)$. For a fixed
$\lambda$, $U_\lambda$ is an LPTS with the alphabet $\{a, b\}$. The strategy for
an adversarial teacher is described in Algorithm 2 which is
briefly summarized in words below. Let $U_\lambda$, for some unknown
$\lambda$, be the unknown target and $H_n$ be the hypothesis at the
beginning of every round $n \ge 1$ of the active learning loop
(we count rounds beginning with 1). The teacher acts as an
adversary by manipulating the value of $\lambda$ as necessary and it
suffices to show that there is some LPTS consistent with all the
counterexamples generated so far. So, let $\lambda_n$ be the value of $\lambda$
at the beginning of round $n$ and let $\mu_n$ be the corresponding
distribution on $a$.

In every round $n$, the teacher first checks $H_n \preceq U_\lambda$,
returning a negative counterexample if it fails, and then checks
$U_\lambda \preceq H_n$, returning a positive counterexample if it fails. If
both checks succeed (i.e. $U_\lambda \sim H_n$), the teacher modifies
the value of $\lambda$ such that $U_{\lambda_n} \preceq U_\lambda$ but not the other way
around. This is achieved by incrementing its value at line
15, where $Dist_a[\mathcal{N}]$ is the set of distributions labeled by $a$
in $\mathcal{N}$. First, it computes $\lambda^+$ which is the least of all $p_b^\mu$'s,
greater than $\lambda$, and 1 where $\mu$ is any distribution appearing
in a transition of any negative counterexample labeled by $a$
and $p_b^\mu$ is the measure, under $\mu$, of all the states having a
transition on $b$. It then updates $\lambda$ to the mean of $\lambda$ and $\lambda^+$, i.e.
$\lambda_{n+1} = (\lambda_n + \lambda^+)/2$. After this update, as $\lambda > \lambda_n$, $U_{\lambda_n} \preceq U_\lambda$
holds but $U_\lambda \not\preceq U_{\lambda_n}$ and hence, $U_\lambda \not\preceq H_n$. This ensures that
a positive counterexample $P$ always exists, justifying line 16.
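The update of the hidden probability by the adversarial teacher can be written as a short Python sketch. The function name is hypothetical, and the input is assumed to be the list of measures of b-enabled states under the a-labeled distributions of the negative counterexamples collected so far.

```python
from fractions import Fraction


def next_lambda(lam, negative_measures):
    """Move lam halfway towards the least measure above it (or towards 1).

    lam               -- current value, a rational in (0, 1)
    negative_measures -- the measures of b-enabled states under the
                         a-labeled distributions of the negative samples
    """
    lam_plus = min([p for p in negative_measures if p > lam] + [Fraction(1)])
    return (lam + lam_plus) / 2
```

Since the new value lies strictly between the old value and the least measure above it, no previously returned negative counterexample with a larger measure becomes simulated by the new target.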

Now, it is easy to see that A + at line 14 is well-defined and 
always exists. Thus, the teacher can return a counterexample 
for every hypothesis made by the learner. 

We will now show that U\ n is consistent with V and M at 
the beginning of each round n > 1 by induction on n, where V 
and Af are the sets of positive and negative counterexamples, 
respectively. For n = 1, V U Af = and hence, U\ 1 is 
consistent. 

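Assume that $U_{\lambda_m}$ is consistent with
$\mathcal{P} \cup \mathcal{N}$ for some $m \geq 1$. If a negative
(positive) counterexample $N$ ($P$) is added to $\mathcal{N}$
($\mathcal{P}$) at line 7 (11), $N \not\preceq U_{\lambda_m}$
($P \preceq U_{\lambda_m}$) by Definition 6. As
$U_{\lambda_m} = U_{\lambda_{m+1}}$, $U_{\lambda_{m+1}}$ is consistent
with $\mathcal{P}$ and $\mathcal{N}$. Now, let $P$ be a positive
counterexample added to $\mathcal{P}$ at line 17. Clearly,
$P \preceq U_{\lambda_{m+1}}$ by Definition 6. Also, by induction
hypothesis, for every $P' \in \mathcal{P} \setminus \{P\}$,
$P' \preceq U_{\lambda_m}$ and as
$U_{\lambda_m} \preceq U_{\lambda_{m+1}}$ (from above), we obtain
$P' \preceq U_{\lambda_{m+1}}$ from Lemma 3. Let $N \in \mathcal{N}$.
By induction hypothesis, $N \not\preceq U_{\lambda_m}$ and we need to
show that $N \not\preceq U_{\lambda_{m+1}}$. For the sake of
contradiction, assume that $N \preceq U_{\lambda_{m+1}}$.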



Now, every transition outgoing from $s^0_N$ is labeled by $a$,
as the only transition outgoing from the start state of
$U_{\lambda_{m+1}}$ is labeled by $a$. Let $s^0_N \xrightarrow{a} \nu$.
So, $\nu \sqsubseteq_{\preceq} \mu_{m+1}$. No state in $Supp(\nu)$ has
a transition labeled by an action other than $b$, as otherwise,
$\nu \not\sqsubseteq_{\preceq} \mu_{m+1}$. That is, every state $s$ in
$Supp(\nu)$ either has no outgoing transition or has a transition
labeled by $b$. One can easily argue that this is also the case for any
transition outgoing from $s$ and so on. Consider $p^b_\nu$, the measure
of all the states having a transition on $b$ under $\nu$. We have that
$p^b_\nu \leq \lambda_{m+1}$, as otherwise,
$\nu \not\sqsubseteq_{\preceq} \mu_{m+1}$.

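If $p^b_\nu \leq \lambda_m$, clearly $N \preceq U_{\lambda_m}$, which
leads to a contradiction. So, $p^b_\nu > \lambda_m$. But then, by
construction of $U_{\lambda_{m+1}}$ (line 14 of Algorithm 2),
$\lambda^+ \leq p^b_\nu$ and therefore
$\lambda_{m+1} < \lambda^+ \leq p^b_\nu$, leading to a contradiction.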

We conclude that $N \not\preceq U_{\lambda_{m+1}}$. This completes the
inductive step. Intuitively, whenever $\lambda$ is updated at line 15,
it is as if the unknown target is $U_\lambda$ from the beginning and no
inconsistencies arise.

Hence, the learner keeps receiving counterexamples and will 
never converge to the unknown target. ■ 



Algorithm 2 An adversarial teacher in the proof of Theorem 

i 

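 1: $n \leftarrow 1$
 2: $\lambda \leftarrow$ arbitrary rational in $(0, 1)$
 3: $\mathcal{N} \leftarrow \emptyset$, $\mathcal{P} \leftarrow \emptyset$
 4: repeat
 5:   if $H_n \not\preceq U_\lambda$ then
 6:     let $N$ be a tree counterexample (Def. 6)
 7:     $\mathcal{N} \leftarrow \mathcal{N} \cup \{N\}$
 8:     return $N$ to the learner as a negative counterexample
 9:   else if $U_\lambda \not\preceq H_n$ then
10:     let $P$ be a tree counterexample (Def. 6)
11:     $\mathcal{P} \leftarrow \mathcal{P} \cup \{P\}$
12:     return $P$ to the learner as a positive counterexample
13:   else
14:     $\lambda^+ = \min(\{p^b_\mu > \lambda \mid \mu \in Dist_a[\mathcal{N}]\} \cup \{1\})$
15:     $\lambda \leftarrow (\lambda^+ + \lambda)/2$
16:     let $P$ be a tree counterexample to $U_\lambda \preceq H_n$ (Def. 6)
17:     $\mathcal{P} \leftarrow \mathcal{P} \cup \{P\}$
18:     return $P$ to the learner as a positive counterexample
19:   end if
20:   $n \leftarrow n + 1$
21: until false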



Proof of Theorem 6

We give an example where it is impossible for the learner to
converge to the target, up to ~, in the presence of an adversarial
teacher.

Consider Hi in Figure 6 as the unknown target $U$ and let
$H_n$ be the hypothesis at the beginning of each round $n \geq 1$
(we count rounds beginning with 1) of the active learning loop.
We describe below a strategy for the teacher that keeps generating
counterexamples no matter what the conjectured hypothesis is.

By Condition 2, $H_1$ is an LPTS with a single state, which
is also the start state. Initially, in every round $n \geq 1$, the
teacher first checks if $H_n$ has a transition on an action other
than $a$, $b$ or $c$, in which case, clearly, $H_n \not\preceq U$ and a
negative tree counterexample is returned using the algorithm sketched
in Section [EI]. Then, the teacher checks $P \preceq H_n$ and returns
$P$ as a positive tree counterexample if it fails, where $P$ is in
Figure 5. Note that $P$ has an execution mapping to $U$ and
hence, the teacher satisfies Condition 1. According to this
strategy, the learner keeps receiving negative counterexamples
for transitions on actions other than $a$, $b$ and $c$ or the positive
counterexample $P$. Either this goes on forever, in which case
we are done, or its hypothesis converges to the LPTS $H^*$
(disallowing duplicate transitions) with a single state and Dirac
self-loops on $a$, $b$ and $c$. We will assume the latter, i.e. the
learner conjectures $H^*$ after some finite number of rounds.
Note that it is possible that $P$ has not yet been returned as a
positive counterexample to the learner.

At this point, the teacher returns $N_a$ in Figure 5 as a negative
counterexample. This forces every future hypothesis to have
at least two states. In fact, the LPTS $H_\lambda$ with two states in
Figure 6 for any $0 < \lambda < 1$ is a consistent hypothesis. By
Condition 2, the next hypothesis has only two states. Now,
we describe the teacher's strategy for future rounds. For this
strategy, we show that a consistent LPTS of two states exists
and that a counterexample can be returned, in every round.
So, let $s_1$ and $s_2$ be the two states of the hypothesis with $s_1$
being the start state. Furthermore, let $\Delta^i_a$, $\Delta^i_b$ and
$\Delta^i_c$ be the sets of distributions outgoing from $s_i$,
$i = 1, 2$, on actions $a$, $b$ and $c$, respectively. The teacher's
strategy in every future round is as follows.

1) As in the initial strategy, it first checks if there is a
reachable state in the hypothesis with a transition on
an action other than $a$, $b$ and $c$ and returns a negative
counterexample (see Section HJl) if there is one.

2) Then, it checks $P \preceq H_n$ and returns $P$ as a positive
counterexample if it fails.

3) At this point, $P \preceq H_n$ and $N_a \not\preceq H_n$ hold ($H_n$
is consistent with them) and we infer the following.

(i) $\Delta^1_a \neq \emptyset$ and for every $\mu_a \in \Delta^1_a$,
$\mu_a(s_1) < 1$ and

(ii) $\Delta^1_b \neq \emptyset$ and for every $\mu_b \in \Delta^1_b$
and every $s_i \in Supp(\mu_b)$, $\Delta^i_c \neq \emptyset$.

The teacher, therefore, does the following.

a) If there is a $\mu_b \in \Delta^1_b$ with $\mu_b(s_1) = 1$, it
returns $N_b$ in Figure 5 as a negative counterexample. Clearly, $N_b$
has an execution mapping to $H_n$.

b) Otherwise, there exists a $\mu_b \in \Delta^1_b$ with
$\mu_b(s_2) > 0$, implying $\Delta^2_c \neq \emptyset$, and
$N_{\beta,\gamma}$ in Figure 5 is returned as a negative
counterexample, where $\beta = \mu_a(s_2)$ for some
$\mu_a \in \Delta^1_a$ and $\gamma = \mu_c(s_2)$ for some
$\mu_c \in \Delta^2_c$. Again, $N_{\beta,\gamma}$ has an execution
mapping to $H_n$.

Clearly, except for a counterexample generated in case 3(b)
above, $H_\lambda$ is a consistent hypothesis for any
$\lambda \in (0,1)$. For case 3(b), $H_\lambda$ with
$0 < \lambda < \beta$ is consistent. So, after any round, $H_\lambda$
with $\lambda$ set to a value smaller than the least $\beta$ of any
$N_{\beta,\gamma}$ returned is consistent, and such a $\lambda$ always
exists as there are infinitely many rationals in $(0,1)$. Thus,
Condition 2 forces the learner to always conjecture a two-state LPTS
and hence, it keeps receiving counterexamples and will never converge
to $U$. ■
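
The existence of a suitable $\lambda$ after any finite number of rounds can be illustrated with a minimal sketch. It assumes that every $\beta$ returned in case 3(b) is a positive rational (which follows from $\mu_a(s_1) < 1$); the function name `consistent_lambda` and the choice of halving the bound are assumptions made only for this example, since any rational strictly between 0 and the least $\beta$ would do.

```python
from fractions import Fraction


def consistent_lambda(betas):
    """Return a rational lambda with 0 < lambda < min(betas) and lambda < 1.

    betas: the beta values of all counterexamples returned so far in
        case 3(b), assumed to be positive rationals (hypothetical
        encoding introduced for this sketch).
    """
    # With no case 3(b) counterexample yet, any value in (0, 1) works,
    # so 1 acts as the default upper bound.
    bound = min(list(betas) + [Fraction(1)])
    # Halving a positive bound gives a value strictly below it and above 0.
    return bound / 2


chosen = consistent_lambda([Fraction(1, 3), Fraction(1, 2)])
print(chosen)  # 1/6
```

However many counterexamples of case 3(b) have been returned, the bound stays positive, so a two-state hypothesis $H_\lambda$ remains available to the learner.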



